Release and Version History
x.y.z (Backlog)
Features and Improvements
Minor Improvements
Bugfixes
Miscellaneous
0.3.1 (2024-08-27)
💥Breaking Changes
- Removed the following public APIs. Parameters are no longer used to customize the batch_read_snapshot_data_file_func logic; all data transformation logic should now be implemented in the batch_read_snapshot_data_file_func function itself.
    - dbsnaplake.api.T_EXTRACTOR
    - dbsnaplake.api.DerivedColumn
- Removed the following writers. Parquet files are now written with polars_writer.
    - dbsnaplake.api.write_parquet_to_s3
    - dbsnaplake.api.write_data_file
- Add the polars_writer parameter to the following APIs:
    - dbsnaplake.api.step_2_3_process_partition_file_group_manifest_file
    - dbsnaplake.api.Project
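As a rough illustration of the first breaking change, the sketch below folds a derived-column computation directly into a batch read function instead of describing it through a separate extractor or derived-column parameter. Everything here is an assumption made for illustration: the function names, the no-argument signature, the record layout, and the use of plain dictionaries are hypothetical and do not reflect the actual dbsnaplake callable signature, which this changelog does not show.

```python
from datetime import datetime
from typing import Any, Dict, List

# Hypothetical record type for this sketch only; not dbsnaplake's T_RECORD.
Record = Dict[str, Any]


def read_raw_records() -> List[Record]:
    """Stand-in for reading one batch of snapshot data files."""
    return [
        {"order_id": "o-1", "order_time": "2024-08-27T10:15:00+00:00"},
        {"order_id": "o-2", "order_time": "2024-08-28T23:59:59+00:00"},
    ]


def my_batch_read_snapshot_data_file_func() -> List[Record]:
    """Hypothetical batch read function: read AND transform in one place.

    The derived columns (year, month) are computed here rather than being
    declared through a separate derived-column configuration.
    """
    records: List[Record] = []
    for raw in read_raw_records():
        ts = datetime.fromisoformat(raw["order_time"])
        records.append(
            {
                **raw,
                # Derived partition-style columns, zero-padded strings.
                "year": f"{ts.year:04d}",
                "month": f"{ts.month:02d}",
            }
        )
    return records


rows = my_batch_read_snapshot_data_file_func()
```

The point of the sketch is only the shape of the migration: whatever used to be expressed as extractor or derived-column parameters becomes ordinary code inside the read function.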
Features and Improvements
- Parquet is no longer the required datalake format. You can now use any format supported by polars.
- Add the following public APIs:
    - dbsnaplake.api.constants.S3_METADATA_KEY_N_RECORD
0.2.1 (2024-08-16)
Minor Improvements
- The column parameter is now optional in dbsnaplake.api.validate_datalake.
- Add the following public APIs:
    - dbsnaplake.api.S3Location.s3path_validate_datalake_result
    - dbsnaplake.api.step_3_1_validate_datalake
    - dbsnaplake.api.Project.step_3_1_validate_datalake
0.1.2 (2024-08-16)
Features and Improvements
- Add the following public APIs that were omitted from the previous release:
    - dbsnaplake.api.ValidateDatalakeResult
    - dbsnaplake.api.validate_datalake
0.1.1 (2024-08-15)
First release
Add the following public APIs:
- dbsnaplake.api.constants
- dbsnaplake.api.constants.COL_RECORD_ID
- dbsnaplake.api.constants.COL_CREATE_TIME
- dbsnaplake.api.constants.COL_UPDATE_TIME
- dbsnaplake.api.constants.S3_METADATA_KEY_SIZE
- dbsnaplake.api.constants.S3_METADATA_KEY_N_RECORD
- dbsnaplake.api.constants.S3_METADATA_KEY_SNAPSHOT_DATA_FILE
- dbsnaplake.api.constants.S3_METADATA_KEY_STAGING_PARTITION
- dbsnaplake.api.constants.MANIFESTS_FOLDER
- dbsnaplake.api.constants.DATALAKE_FOLDER
- dbsnaplake.api.constants.SNAPSHOT_FILE_GROUPS_FOLDER
- dbsnaplake.api.constants.STAGING_FILE_GROUPS_FOLDER
- dbsnaplake.api.constants.PARTITION_FILE_GROUPS_FOLDER
- dbsnaplake.api.constants.MANIFEST_SUMMARY_FOLDER
- dbsnaplake.api.constants.MANIFEST_DATA_FOLDER
- dbsnaplake.api.T_RECORD
- dbsnaplake.api.T_DF_SCHEMA
- dbsnaplake.api.T_EXTRACTOR
- dbsnaplake.api.T_OPTIONAL_KWARGS
- dbsnaplake.api.repr_data_size
- dbsnaplake.api.S3Location
- dbsnaplake.api.Partition
- dbsnaplake.api.extract_partition_data
- dbsnaplake.api.encode_hive_partition
- dbsnaplake.api.get_s3dir_partition
- dbsnaplake.api.get_partitions
- dbsnaplake.api.write_parquet_to_s3
- dbsnaplake.api.write_data_file
- dbsnaplake.api.read_parquet_from_s3
- dbsnaplake.api.read_many_parquet_from_s3
- dbsnaplake.api.group_by_partition
- dbsnaplake.api.get_merged_schema
- dbsnaplake.api.harmonize_schemas
- dbsnaplake.api.dummy_logger
- dbsnaplake.api.DBSnapshotManifestFile
- dbsnaplake.api.DBSnapshotManifestFile.split_into_groups
- dbsnaplake.api.DBSnapshotFileGroupManifestFile
- dbsnaplake.api.DBSnapshotFileGroupManifestFile.read_all_groups
- dbsnaplake.api.DerivedColumn
- dbsnaplake.api.StagingFileGroupManifestFile
- dbsnaplake.api.T_BatchReadSnapshotDataFileCallable
- dbsnaplake.api.process_db_snapshot_file_group_manifest_file
- dbsnaplake.api.extract_s3_directory
- dbsnaplake.api.PartitionFileGroupManifestFile
- dbsnaplake.api.PartitionFileGroupManifestFile.plan_partition_compaction
- dbsnaplake.api.PartitionFileGroupManifestFile.read_all_groups
- dbsnaplake.api.process_partition_file_group_manifest_file
- dbsnaplake.api.T_TASK
- dbsnaplake.api.create_orm_model
- dbsnaplake.api.step_1_1_plan_snapshot_to_staging
- dbsnaplake.api.step_1_2_get_snapshot_to_staging_todo_list
- dbsnaplake.api.step_1_3_process_db_snapshot_file_group_manifest_file
- dbsnaplake.api.step_2_1_plan_staging_to_datalake
- dbsnaplake.api.step_2_2_get_staging_to_datalake_todo_list
- dbsnaplake.api.step_2_3_process_partition_file_group_manifest_file
- dbsnaplake.api.logger
- dbsnaplake.api.Project
- dbsnaplake.api.Project.step_1_1_plan_snapshot_to_staging
- dbsnaplake.api.Project.step_1_2_process_db_snapshot_file_group_manifest_file
- dbsnaplake.api.Project.step_2_1_plan_staging_to_datalake
- dbsnaplake.api.Project.step_2_2_process_partition_file_group_manifest_file