v0.24.0

@lars-reimann lars-reimann released this 09 May 13:30

0.24.0 (2024-05-09)

This release features completely rewritten containers for tabular data (currently experimental). They use the extremely fast polars library as their backend. Together with a drastically more efficient implementation of our own interface, operations on tabular data are now as fast as they should be.

Previously, even operations on small tables (10,000 rows × 50 columns) took a long time, as this comparison of Table methods shows:

| method                          | old (s) | new (s) | speedup (factor) |
|---------------------------------|---------|---------|------------------|
| remove_duplicate_rows           | 0.25474 | 0.01306 | 19.5             |
| remove_rows_with_missing_values | 0.25159 | 0.00946 | 26.6             |
| remove_rows_with_outliers       | 0.28816 | 0.01034 | 27.9             |
| remove_rows                     | 2.69647 | 0.00242 | 1114.2           |
| shuffle_rows                    | 0.24690 | 0.00204 | 121.0            |
| slice_rows                      | 0.12313 | 0.00011 | 1119.4           |
| sort_rows                       | 4.67574 | 0.00372 | 1256.9           |
| split_rows                      | 0.24764 | 0.00219 | 113.1            |
| transform_column                | 2.89572 | 0.00030 | 9652.4           |
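
The speedup factor in the comparison above is the old runtime divided by the new runtime. A minimal sketch that recomputes it for three of the rows (the `speedup` helper and the `timings` dictionary are names introduced here for illustration, not part of the library):

```python
# Timings in seconds, copied from the comparison above: (old, new).
timings = {
    "remove_duplicate_rows": (0.25474, 0.01306),
    "sort_rows": (4.67574, 0.00372),
    "transform_column": (2.89572, 0.00030),
}


def speedup(old_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new implementation is."""
    return old_seconds / new_seconds


for method, (old, new) in timings.items():
    # Round to one decimal place, as in the comparison.
    print(f"{method}: {speedup(old, new):.1f}x")
```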

You can find a full list of changes below. Special thanks to all contributors.

Features

  • Column.plot_histogram() using Table.plot_histograms for consistent results (#726) (576492c)
  • Regressor.summarize_metrics and Classifier.summarize_metrics (#729) (1cc14b1), closes #713
  • Add ImageDataset and Layer for ConvolutionalNeuralNetworks (#645) (5b6d219), closes #579 #580 #581
  • added load_percentage parameter to ImageList.from_files to load a subset of the given files (#739) (0564b52), closes #736
  • added rnn layer and TimeSeries conversion (#615) (6cad203), closes #614 #648 #656 #601
  • Basic implementation of cell with polars (#734) (004630b), closes #712
  • deprecate Table.add_column and Table.add_row (#723) (5dd9d02), closes #722
  • deprecated Table.from_excel_file and Table.to_excel_file (#728) (c89e0bf), closes #727
  • Larger histogram plot if table only has one column (#716) (31ffd12)
  • polars implementation of a column (#738) (732aa48), closes #712
  • polars implementation of a row (#733) (ff627f6), closes #712
  • polars implementation of table (#744) (fc49895), closes #638 #641 #649 #712
  • regularization for decision trees and random forests (#730) (102de2d), closes #700
  • Remove device information in image class (#735) (d783caa), closes #524
  • return fitted transformer and transformed table from fit_and_transform (#724) (2960d35), closes #613
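
The last item above changes fit_and_transform so that it returns both the fitted transformer and the transformed table. The general pattern can be sketched as follows. This is a toy illustration only: `MeanCenterer` is a hypothetical class that works on plain Python lists, and it does not reproduce the library's actual classes or signatures.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MeanCenterer:
    """Hypothetical toy transformer (not part of the library)."""

    mean: Optional[float] = None

    def fit(self, values: List[float]) -> "MeanCenterer":
        # Return a new fitted copy instead of mutating self.
        return replace(self, mean=sum(values) / len(values))

    def transform(self, values: List[float]) -> List[float]:
        if self.mean is None:
            raise ValueError("The transformer has not been fitted yet.")
        return [value - self.mean for value in values]

    def fit_and_transform(
        self, values: List[float]
    ) -> Tuple["MeanCenterer", List[float]]:
        # Return both results, so the fitted transformer can be reused later.
        fitted = self.fit(values)
        return fitted, fitted.transform(values)


fitted_centerer, centered = MeanCenterer().fit_and_transform([1.0, 2.0, 3.0])
# The fitted transformer can now be applied to other data.
other = fitted_centerer.transform([4.0])
```

Returning the fitted transformer alongside the result means the caller does not have to fit a second time to transform further data, such as a test set.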

Bug Fixes

Performance Improvements

  • improved performance of TabularDataset.__eq__ by a factor of up to 2 (#697) (cd7f55b)