All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Support for model import from parquet file metadata.
- Great Expectations export: add optional args (#496)
  - `suite_name`: the name of the expectation suite to export
  - `engine`: used to run checks
  - `sql_server_type`: to define the type of SQL server to use when `engine` is `sql`
- Changelog support for `Info` and `Terms` blocks.
- Changelog support for custom extension keys in `Models` and `Fields` blocks.
- Raise a valid exception in `DataContractSpecification.from_file` if the file does not exist
Data Contract CLI now supports the Open Data Contract Standard (ODCS) v3.0.0.
- `datacontract test` now also supports the ODCS v3 data contract format
- `datacontract export --format odcs_v3`: Export to Open Data Contract Standard v3.0.0 (#460)
- `datacontract test` now also supports ODCS v3 and Data Contract SQL quality checks on field and model level
- Support for import from Iceberg table definitions.
- Support for decimal logical type on Avro export.
- Support for custom Trino types
- `datacontract import --format odcs`: Now supports ODCS v3.0.0 files (#474)
- `datacontract export --format odcs`: Now creates v3.0.0 Open Data Contract Standard files (alias to `odcs_v3`). Old versions are still available as format `odcs_v2`. (#460)
- Fix timestamp serialization from Parquet -> DuckDB (#472)
- `datacontract export --format data-caterer`: Export to Data Caterer YAML
- `datacontract export --format jsonschema`: handle optional and nullable fields (#409)
- `datacontract import --format unity`: handle nested and complex fields (#420)
- `datacontract import --format spark`: handle field descriptions (#420)
- `datacontract export --format bigquery`: handle bigqueryType (#422)
- Use correct float type with BigQuery (#417)
- Support `DATACONTRACT_MANAGER_API_KEY`
- Some minor bug fixes
- Support for import of DBML Models (#379)
- `datacontract export --format sqlalchemy`: Export to SQLAlchemy ORM models (#399)
- Support of varchar max length in Glue import (#351)
- `datacontract publish` now also accepts the `DATACONTRACT_MANAGER_API_KEY` as an environment variable
- Support required fields for Avro schema export (#390)
- Support data type map in Spark import and export (#408)
- Support of enum on export to Avro
- Support of enum title on Avro import
- Deltalake is now using DuckDB's native Delta Lake support (#258). The `deltalake` extra was removed.
- When dumping to YAML (import) the alias name is used instead of the pythonic name. (#373)
- Fix an issue where the Data Contract CLI fails if installed without any extras (#400)
- Fix an issue where Glue database without a location creates invalid data contract (#351)
- Fix bigint -> long data type mapping (#351)
- Fix an issue where column description for Glue partition key column is ignored (#351)
- Corrected name of table parameter for bigquery import (#377)
- Fix a failure to connect to an S3 server (#384)
- Fix a model bug mismatching with the specification (`definitions.fields`) (#375)
- Fix array type management in Spark import (#408)
- Support data type map in Glue import. (#340)
- Basic HTML export for new `keys` and `values` fields
- Support for recognition of 1:1 relationships when exporting to DBML
- Added support for arrays in JSON schema import (#305)
- Aligned JSON schema import and export of required properties
- Change dbt importer to be more robust and customizable
- Fix required field handling in JSON schema import
- Fix an issue where the quality and definition `$ref` are not always resolved
- Fix an issue where the JSON schema validation fails for a field with type `string` and format `uuid`
- Fix an issue where common DBML renderers may not be able to parse parts of an exported file
- Add support for dbt manifest file (#104)
- Fix import of pyspark for type-checking when pyspark isn't required as a module (#312)
- Add support for referencing fields within a definition (#322)
- Add `map` and `enum` type for Avro schema import (#311)
- `datacontract import --format spark`: Import from Spark tables (#326)
- Fix an issue where specifying `glue_table` as parameter did not filter the tables and instead returned all tables from the `source` database (#333)
- Add support for Trino (#278)
- Spark export: add Spark StructType exporter (#277)
- Add `--schema` option for the `catalog` and `export` commands to provide the schema also locally
- Integrate support into the pre-commit workflow. For further details, please refer to the information provided here.
- Improved HTML export, supporting links, tags, and more
- Add support for AWS SESSION_TOKEN (#309)
- Added array management on HTML export (#299)
- Fix `datacontract import --format jsonschema` when description is missing (#300)
- Fix `datacontract test` with case-sensitive Postgres table names (#310)
- `datacontract serve`: start a local web server to provide a REST API for the commands
- Provide server for SQL export for the appropriate schema (#153)
- Add struct and array management to Glue export (#271)
- Introduced optional dependencies/extras for significantly faster installation times. (#213)
- Added delta-lake as an additional optional dependency
- Support `GOOGLE_APPLICATION_CREDENTIALS` as variable for connecting to BigQuery in `datacontract test`
- Better support for BigQuery's `type` attribute; don't assume all imported models are tables
- Added initial implementation of an importer from Unity Catalog (not all data types supported yet)
- Added the importer factory. This refactoring aims to make it easier to create new importers, supporting the growth and maintainability of the project. (#273)
- `datacontract export --format avro`: fixed array structure (#243)
- Test data contract against dataframes / temporary views (#175)
- AVRO export: Logical Types should be nested (#233)
- Fixed Docker build by removing msodbcsql18 dependency (temporary workaround)
- Added support for `sqlserver` (#196)
- `datacontract export --format dbml`: Export to Database Markup Language (DBML) (#135)
- `datacontract export --format avro`: Now supports a config map on field level for logicalTypes and default values (Custom Avro Properties)
- `datacontract import --format avro`: Now supports importing logicalType and default definitions in Avro files (Custom Avro Properties)
- Support `config.bigqueryType` for testing BigQuery types
- Added support for selecting specific tables in an AWS Glue import through the `glue-table` parameter (#122)
- Fixed jsonschema export for models with empty object-typed fields (#218)
- Fixed testing BigQuery tables with BOOL fields
- `datacontract catalog`: Show search bar also on mobile
- `datacontract catalog`: Search
- `datacontract publish`: Publish the data contract to the Data Mesh Manager
- `datacontract import --format bigquery`: Import from BigQuery format (#110)
- `datacontract export --format bigquery`: Export to BigQuery format (#111)
- `datacontract export --format avro`: Now supports Avro logical types to better model date types. `date`, `timestamp`/`timestamp-tz` and `timestamp-ntz` are now mapped to the appropriate logical types. (#141)
- `datacontract import --format jsonschema`: Import from JSON schema (#91)
- `datacontract export --format jsonschema`: Improved export by exporting more additional information
- `datacontract export --format html`: Added support for Service Levels, Definitions, Examples and nested Fields
- `datacontract export --format go`: Export to Go types format
- `datacontract catalog`: Add `index.html` to manifest
- Added Glue import (#166)
- Added test support for `azure` (#146)
- Added support for `delta` tables on S3 (#24)
- Added new command `datacontract catalog` that generates a data contract catalog with an `index.html` file.
- Added field format information to HTML export
- RDF Export: Fix error if owner is not a URI/URN
- Fixed docker columns
- Added timestamp when an HTML export was created
- Fixed export format html
- Added export format html (#15)
- Added descriptions as comments to `datacontract export --format sql` for Databricks dialects
- Added import of arrays in Avro import
- Added export format great-expectations: `datacontract export --format great-expectations`
- Added gRPC support to OpenTelemetry integration for publishing test results
- Added AVRO import support for namespace (#121)
- Added handling for optional fields in avro import (#112)
- Added Databricks SQL dialect for `datacontract export --format sql`
- Use `sql_type_converter` to build checks.
- Fixed AVRO import when doc is missing (#121)
- Added option to publish test results to OpenTelemetry: `datacontract test --publish-to-opentelemetry`
- Added export format protobuf: `datacontract export --format protobuf`
- Added export format terraform: `datacontract export --format terraform` (limitation: only works for AWS S3 right now)
- Added export format sql: `datacontract export --format sql`
- Added export format sql-query: `datacontract export --format sql-query`
- Added export format avro-idl: `datacontract export --format avro-idl` generates an Avro IDL file containing records for each model.
- Added new command changelog: `datacontract changelog datacontract1.yaml datacontract2.yaml` will now generate a changelog based on the changes in the data contract. This is useful for keeping track of changes in the data contract over time.
- Added extensive linting on data contracts. `datacontract lint` will now check for a variety of possible errors in the data contract, such as missing descriptions, incorrect references to models or fields, nonsensical constraints, and more.
- Added importer for Avro schemas. `datacontract import --format avro` will now import Avro schemas into a data contract.
- Fixed a bug where the export to YAML always escaped the unicode characters.
- Test Kafka for Avro messages
- Added export format avro: `datacontract export --format avro`
This is a huge step forward: we now support testing Kafka messages. We start with JSON and Avro messages; Protobuf will follow.
- Test Kafka for JSON messages
- Added import format sql: `datacontract import --format sql` (#51)
- Added export format dbt-sources: `datacontract export --format dbt-sources`
- Added export format dbt-staging-sql: `datacontract export --format dbt-staging-sql`
- Added export format rdf: `datacontract export --format rdf` (#52)
- Added command `datacontract breaking` to detect breaking changes between two data contracts.
- Export to dbt models (#37).
- Export to ODCS (#49).
- `test`: Show a test summary table.
- `lint`: Support local schema (#46).
- Support for Postgres
- Support for Databricks
- Support for BigQuery data connection
- Support for multiple models with S3
- Fix Docker images. Disable builds for linux/amd64.
- Publish to Docker Hub
This is a breaking change (we are still on a 0.x.x version). The project migrated from Golang to Python. The Golang version can be found at cli-go
- `test`: Support to directly run tests and connect to data sources defined in the servers section.
- `test`: Generated schema tests from the model definition.
- `test --publish URL`: Publish test results to a server URL.
- `export`: Now exports the data contract to the formats jsonschema and sodacl.
- The `--file` option was removed in favor of a direct argument: use `datacontract test datacontract.yaml` instead of `datacontract test --file datacontract.yaml`.
- `model` is now part of `export`
- `quality` is now part of `export`
- Temporarily removed: `diff` needs to be migrated to Python.
- Temporarily removed: `breaking` needs to be migrated to Python.
- Temporarily removed: `inline` needs to be migrated to Python.
- Support local JSON schema in lint command.
- Update to specification 0.9.2.
- Fix format flag bug in model (print) command.
- Log to STDOUT.
- Rename `model` command parameter `type` -> `format`.
- Remove `schema` command.
- Fix documentation.
- Security update of x/sys.
- Adapt Data Contract Specification in version 0.9.2.
- Use `models` section for `diff`/`breaking`.
- Add `model` command.
- Let `inline` print to STDOUT instead of overwriting the data contract file.
- Let `quality` write input from STDIN if present.
- Basic implementation of `test` command for Soda Core.
- Change package structure to allow usage as library.
- Fix field parsing for dbt models, affects stability of `diff`/`breaking`.
- Fix comparing order of contracts in `diff`/`breaking`.
- Handle non-existent schema specification when using `diff`/`breaking`.
- Resolve local and remote resources such as schema specifications when using the "$ref: ..." notation.
- Implement `schema` command: prints your schema.
- Implement `quality` command: prints your quality definitions.
- Implement the `inline` command: resolves all references using the "$ref: ..." notation and writes them to your data contract.
- Allow remote and local locations for all data contract inputs (`--file`, `--with`).
- Add `diff` command for dbt schema specification.
- Add `breaking` command for dbt schema specification.
- Suggest a fix during `init` when the file already exists.
- Rename `validate` command to `lint`.
- Remove `check-compatibility` command.
- Improve usage documentation.
- Initial release.