Releases: JDASoftwareGroup/kartothek

Kartothek v5.3.0

10 Dec 09:44
1821ea5

Version 5.3.0 (2021-12-10)

  • Add deprecation warnings and migration helpers to ease the
    migration to Kartothek 6.0.0.
  • Removed warning for distinct categoricals (#501)

Kartothek v5.2.0

22 Nov 08:49

Version 5.2.0 (2021-11-22)

  • Remove support for Python 3.6
  • Allow pyarrow<7 as a dependency.

Kartothek v5.1.0

05 Jul 11:41
efd0bc5

Version 5.1.0 (2021-07-05)

  • Add `kartothek.io.eager.copy_dataset` to copy and optionally rename
    datasets within one store or between stores (eager only)
  • Add a renaming option to `kartothek.io.eager_cube.copy_cube`
  • Add a converter from predicates to cube conditions to
    `kartothek.utils.predicate_converter`

Kartothek v5.0.0

23 Jun 09:28

Version 5.0.0 (2021-06-23)

This release rolls all the changes introduced with 4.x back to 3.20.0.

As the incompatibility between 4.0 and 5.0 will be an issue for some
customers, we encourage you to use the very stable kartothek 3.20.0 and
not version 4.x.

Please refer to issue #471 for further information.

Kartothek v5.0.0rc1

11 Jun 19:39

Version 5.0.0 (2021-05-xx)

This release rolls all the changes introduced with 4.x back to 3.20.0.

As the incompatibility between 4.0 and 5.0 will be an issue for some
customers, we encourage you to use the very stable kartothek 3.20.0 and
not version 4.x.

Please refer to issue #471 for further information.

Kartothek v4.0.3

11 Jun 19:44

Kartothek 4.0.3 (2021-06-10)

  • Pin dask to exclude versions 2021.5.1 and 2020.6.0 (#475)

Kartothek v4.0.1

13 Apr 13:57

Kartothek 4.0.1 (2021-04-13)

  • Fixed dataset corruption after updates when table names other than
    "table" are used (#445).

Kartothek v4.0.0

17 Mar 17:04
08a8094

Kartothek 4.0.0 (2021-03-17)

This is a major release of kartothek with breaking API changes.

  • Removal of complex user input (see gh427)
  • Removal of multi table feature
  • Removal of the `kartothek.io.merge` module
  • Class `kartothek.core.dataset.DatasetMetadata` now has an attribute
    called `schema`, which replaces the previous attribute `table_meta`
    and returns only a single schema
  • All outputs which previously returned a sequence of dictionaries,
    where each key-value pair corresponded to a table-data pair, now
    return only one `pandas.DataFrame`
  • All read pipelines now automatically infer the table to read, so
    it is no longer necessary to provide `table` or `table_name` as an
    input argument
  • All writing pipelines which previously supported a complex user
    input type now expose an argument `table_name` which can be used to
    continue usage of legacy datasets (i.e. datasets with an intrinsic,
    non-trivial table name). This usage is discouraged and we recommend
    that users migrate to the default table name (i.e. leave it as
    `None` / `table`)
  • All pipelines which previously accepted an argument `tables` to
    select the subset of tables to load no longer accept this keyword.
    Instead, the table to load is inferred
  • Trying to read a multi-tabled dataset will now cause an exception
    telling users that this is no longer supported with kartothek 4.0
  • The dict schema for
    `kartothek.core.dataset.DatasetMetadataBase.to_dict` and
    `kartothek.core.dataset.DatasetMetadata.from_dict` changed,
    replacing the dictionary in `table_meta` with the simple `schema`
  • All pipeline arguments which previously accepted a dictionary of
    sequences to describe a table-specific subset of columns now accept
    plain sequences (e.g. `columns`, `categoricals`)
  • Remove the following list of deprecated arguments for io pipelines
    • label_filter
    • central_partition_metadata
    • load_dynamic_metadata
    • load_dataset_metadata
    • concat_partitions_on_primary_index
  • Remove `output_dataset_uuid` and `df_serializer` from
    `kartothek.io.eager.commit_dataset` since these arguments didn't
    have any effect
  • Remove `metadata`, `df_serializer`, `overwrite`, `metadata_merger`
    from `kartothek.io.eager.write_single_partition`
  • `kartothek.io.eager.store_dataframes_as_dataset` now requires a list
    as an input
  • Default value for argument `date_as_object` is now universally set
    to `True`. The behaviour for `False` will be deprecated and removed
    in the next major release
  • Passing `delete_scope` as a delayed object to
    `kartothek.io.dask.dataframe.update_dataset_from_ddf` is no longer
    allowed
  • `kartothek.io.dask.dataframe.update_dataset_from_ddf` and
    `kartothek.io.dask.dataframe.store_dataset_from_ddf` now return a
    `dd.core.Scalar` object. This enables all `dask.DataFrame` graph
    optimizations by default.
  • Remove argument `table_name` from
    `kartothek.io.dask.dataframe.collect_dataset_metadata`
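
To illustrate the move from per-table dictionaries to plain sequences described in the list above, here is a minimal sketch of how calling code might adapt a legacy `columns` or `categoricals` value before passing it to a 4.0 pipeline. The helper `flatten_table_argument` is hypothetical and not part of kartothek; it only uses the Python standard library and assumes the legacy value maps exactly one table name to its column list.

```python
from typing import Dict, List, Optional, Sequence, Union

# Legacy values were either a per-table mapping or already a plain sequence.
LegacyValue = Union[Dict[str, Sequence[str]], Sequence[str], None]


def flatten_table_argument(value: LegacyValue) -> Optional[List[str]]:
    """Hypothetical helper: turn a per-table mapping into a plain list."""
    if value is None:
        return None
    if isinstance(value, dict):
        # Multi-table input has no equivalent after the 4.0 changes,
        # so anything other than exactly one table is rejected.
        if len(value) != 1:
            raise ValueError(
                "expected exactly one table, got: " + ", ".join(sorted(value))
            )
        (columns,) = value.values()
        return list(columns)
    # Already a plain sequence: return a list copy.
    return list(value)


legacy_columns = {"table": ["a", "b"]}
print(flatten_table_argument(legacy_columns))
```

A plain sequence passes through unchanged, so the helper can be applied to both old and new call sites during a migration.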

Kartothek v3.20.0

15 Mar 10:22

Version 3.20.0 (2021-03-15)

This will be the final release in the 3.X series. Please ensure your
existing codebase does not raise any DeprecationWarning from kartothek
and migrate your import paths ahead of time to the new `kartothek.api`
modules to ensure a smooth migration to 4.X.
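
One way to check a codebase for deprecation warnings is to escalate them to errors while exercising the code. The following is a minimal sketch using only Python's standard `warnings` module; `legacy_call` is a hypothetical stand-in for application code that hits a deprecated code path and is not part of kartothek.

```python
import warnings


def legacy_call():
    # Hypothetical stand-in for application code that uses a deprecated API.
    warnings.warn("this argument is deprecated", DeprecationWarning, stacklevel=2)
    return "ok"


def raises_deprecation(func):
    """Return True if calling func emits a DeprecationWarning."""
    with warnings.catch_warnings():
        # Turn DeprecationWarning into an exception inside this block only.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning:
            return True
    return False


print(raises_deprecation(legacy_call))
```

When running a test suite, a similar effect can be achieved without code changes by starting the interpreter with `python -W error::DeprecationWarning`.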

  • Introduce `kartothek.api` as the public definition of the API. See
    also the versioning documentation.
  • Introduce `DatasetMetadataBase.schema` to prepare deprecation of
    `table_meta`
  • `kartothek.io.eager.read_dataset_as_dataframes` and
    `kartothek.io.iter.read_dataset_as_dataframes__iterator` now
    correctly return categoricals as requested for misaligned
    categories.

Kartothek v3.19.1

24 Feb 11:20

Version 3.19.1 (2021-02-24)

  • Allow pyarrow==3 as a dependency.
  • Fix a bug in `kartothek.io_components.utils.align_categories` for
    dataframes with missing values and non-categorical dtype.
  • Fix an issue with the cube index validation introduced in v3.19.0
    (#413).