fix imports
lhoestq committed Sep 9, 2022
1 parent a3f26c1 commit 2162de9
Showing 2 changed files with 2 additions and 1 deletion.
2 changes: 1 addition & 1 deletion setup.py
@@ -155,7 +155,7 @@
     "langdetect",
     "mauve-text",
     "nltk",
-    # "rouge_score",  # also required by bigbench
+    "rouge_score",
     "sacrebleu",
     "sacremoses",
     "scikit-learn",
1 change: 1 addition & 0 deletions tests/test_metric_common.py
@@ -189,6 +189,7 @@ def predict(self, data, *args, **kwargs):
         yield
 
 
+@pytest.mark.metrics_catalog
 def test_seqeval_raises_when_incorrect_scheme():
     metric = load_metric(os.path.join("metrics", "seqeval"))
     wrong_scheme = "ERROR"

1 comment on commit 2162de9

@github-actions

Show benchmarks

PyArrow==6.0.0

Show updated benchmarks!

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.009507 / 0.011353 (-0.001846) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.004440 / 0.011008 (-0.006568) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.035153 / 0.038508 (-0.003355) |
| read_batch_unformated after write_array2d | 0.040975 / 0.023109 (0.017866) |
| read_batch_unformated after write_flattened_sequence | 0.352819 / 0.275898 (0.076921) |
| read_batch_unformated after write_nested_sequence | 0.435455 / 0.323480 (0.111976) |
| read_col_formatted_as_numpy after write_array2d | 0.006832 / 0.007986 (-0.001154) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.003894 / 0.004328 (-0.000434) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.008099 / 0.004250 (0.003848) |
| read_col_unformated after write_array2d | 0.060472 / 0.037052 (0.023420) |
| read_col_unformated after write_flattened_sequence | 0.364160 / 0.258489 (0.105671) |
| read_col_unformated after write_nested_sequence | 0.411536 / 0.293841 (0.117695) |
| read_formatted_as_numpy after write_array2d | 0.035932 / 0.128546 (-0.092614) |
| read_formatted_as_numpy after write_flattened_sequence | 0.010983 / 0.075646 (-0.064663) |
| read_formatted_as_numpy after write_nested_sequence | 0.307189 / 0.419271 (-0.112083) |
| read_unformated after write_array2d | 0.059407 / 0.043533 (0.015874) |
| read_unformated after write_flattened_sequence | 0.352161 / 0.255139 (0.097022) |
| read_unformated after write_nested_sequence | 0.380452 / 0.283200 (0.097252) |
| write_array2d | 0.118134 / 0.141683 (-0.023548) |
| write_flattened_sequence | 1.728444 / 1.452155 (0.276290) |
| write_nested_sequence | 1.907536 / 1.492716 (0.414820) |

Benchmark: benchmark_getitem_100B.json

| metric | new / old (diff) |
|---|---|
| get_batch_of_1024_random_rows | 0.226508 / 0.018006 (0.208502) |
| get_batch_of_1024_rows | 0.492179 / 0.000490 (0.491690) |
| get_first_row | 0.007490 / 0.000200 (0.007291) |
| get_last_row | 0.000103 / 0.000054 (0.000049) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.027577 / 0.037411 (-0.009834) |
| shard | 0.120127 / 0.014526 (0.105601) |
| shuffle | 0.133075 / 0.176557 (-0.043481) |
| sort | 0.187034 / 0.737135 (-0.550101) |
| train_test_split | 0.136059 / 0.296338 (-0.160279) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.460185 / 0.215209 (0.244976) |
| read 50000 | 4.625378 / 2.077655 (2.547723) |
| read_batch 50000 10 | 2.118960 / 1.504120 (0.614840) |
| read_batch 50000 100 | 1.851882 / 1.541195 (0.310687) |
| read_batch 50000 1000 | 1.895050 / 1.468490 (0.426560) |
| read_formatted numpy 5000 | 0.469679 / 4.584777 (-4.115098) |
| read_formatted pandas 5000 | 4.431707 / 3.745712 (0.685994) |
| read_formatted tensorflow 5000 | 2.237583 / 5.269862 (-3.032279) |
| read_formatted torch 5000 | 1.360854 / 4.565676 (-3.204822) |
| read_formatted_batch numpy 5000 10 | 0.058789 / 0.424275 (-0.365486) |
| read_formatted_batch numpy 5000 1000 | 0.012710 / 0.007607 (0.005103) |
| shuffled read 5000 | 0.572088 / 0.226044 (0.346044) |
| shuffled read 50000 | 5.676228 / 2.268929 (3.407300) |
| shuffled read_batch 50000 10 | 2.607722 / 55.444624 (-52.836902) |
| shuffled read_batch 50000 100 | 2.198346 / 6.876477 (-4.678130) |
| shuffled read_batch 50000 1000 | 2.354762 / 2.142072 (0.212690) |
| shuffled read_formatted numpy 5000 | 0.597392 / 4.805227 (-4.207835) |
| shuffled read_formatted_batch numpy 5000 10 | 0.133625 / 6.500664 (-6.367039) |
| shuffled read_formatted_batch numpy 5000 1000 | 0.069491 / 0.075469 (-0.005978) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 1.790806 / 1.841788 (-0.050982) |
| map fast-tokenizer batched | 16.525869 / 8.074308 (8.451561) |
| map identity | 28.918352 / 10.191392 (18.726960) |
| map identity batched | 1.054487 / 0.680424 (0.374063) |
| map no-op batched | 0.671973 / 0.534201 (0.137772) |
| map no-op batched numpy | 0.463237 / 0.579283 (-0.116046) |
| map no-op batched pandas | 0.529780 / 0.434364 (0.095416) |
| map no-op batched pytorch | 0.339930 / 0.540337 (-0.200407) |
| map no-op batched tensorflow | 0.335587 / 1.386936 (-1.051350) |

PyArrow==latest

Show updated benchmarks!

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.007002 / 0.011353 (-0.004351) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.004391 / 0.011008 (-0.006617) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.032972 / 0.038508 (-0.005536) |
| read_batch_unformated after write_array2d | 0.039707 / 0.023109 (0.016598) |
| read_batch_unformated after write_flattened_sequence | 0.453267 / 0.275898 (0.177369) |
| read_batch_unformated after write_nested_sequence | 0.530382 / 0.323480 (0.206902) |
| read_col_formatted_as_numpy after write_array2d | 0.004423 / 0.007986 (-0.003563) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.005225 / 0.004328 (0.000896) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.005681 / 0.004250 (0.001431) |
| read_col_unformated after write_array2d | 0.049478 / 0.037052 (0.012426) |
| read_col_unformated after write_flattened_sequence | 0.460157 / 0.258489 (0.201668) |
| read_col_unformated after write_nested_sequence | 0.520295 / 0.293841 (0.226454) |
| read_formatted_as_numpy after write_array2d | 0.034900 / 0.128546 (-0.093646) |
| read_formatted_as_numpy after write_flattened_sequence | 0.011058 / 0.075646 (-0.064588) |
| read_formatted_as_numpy after write_nested_sequence | 0.314668 / 0.419271 (-0.104604) |
| read_unformated after write_array2d | 0.063124 / 0.043533 (0.019591) |
| read_unformated after write_flattened_sequence | 0.465555 / 0.255139 (0.210417) |
| read_unformated after write_nested_sequence | 0.476572 / 0.283200 (0.193372) |
| write_array2d | 0.117887 / 0.141683 (-0.023795) |
| write_flattened_sequence | 1.775174 / 1.452155 (0.323019) |
| write_nested_sequence | 1.772635 / 1.492716 (0.279919) |

Benchmark: benchmark_getitem_100B.json

| metric | new / old (diff) |
|---|---|
| get_batch_of_1024_random_rows | 0.253943 / 0.018006 (0.235936) |
| get_batch_of_1024_rows | 0.489859 / 0.000490 (0.489370) |
| get_first_row | 0.001080 / 0.000200 (0.000880) |
| get_last_row | 0.000087 / 0.000054 (0.000033) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.028091 / 0.037411 (-0.009321) |
| shard | 0.121018 / 0.014526 (0.106492) |
| shuffle | 0.133496 / 0.176557 (-0.043060) |
| sort | 0.183412 / 0.737135 (-0.553723) |
| train_test_split | 0.137980 / 0.296338 (-0.158358) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.506298 / 0.215209 (0.291089) |
| read 50000 | 5.120300 / 2.077655 (3.042646) |
| read_batch 50000 10 | 2.623041 / 1.504120 (1.118921) |
| read_batch 50000 100 | 2.430875 / 1.541195 (0.889680) |
| read_batch 50000 1000 | 2.555516 / 1.468490 (1.087026) |
| read_formatted numpy 5000 | 0.494299 / 4.584777 (-4.090478) |
| read_formatted pandas 5000 | 4.396585 / 3.745712 (0.650873) |
| read_formatted tensorflow 5000 | 2.309823 / 5.269862 (-2.960039) |
| read_formatted torch 5000 | 1.403815 / 4.565676 (-3.161861) |
| read_formatted_batch numpy 5000 10 | 0.060622 / 0.424275 (-0.363653) |
| read_formatted_batch numpy 5000 1000 | 0.013187 / 0.007607 (0.005580) |
| shuffled read 5000 | 0.641226 / 0.226044 (0.415182) |
| shuffled read 50000 | 6.519226 / 2.268929 (4.250297) |
| shuffled read_batch 50000 10 | 3.234891 / 55.444624 (-52.209734) |
| shuffled read_batch 50000 100 | 2.832624 / 6.876477 (-4.043853) |
| shuffled read_batch 50000 1000 | 2.913168 / 2.142072 (0.771095) |
| shuffled read_formatted numpy 5000 | 0.635894 / 4.805227 (-4.169333) |
| shuffled read_formatted_batch numpy 5000 10 | 0.143305 / 6.500664 (-6.357359) |
| shuffled read_formatted_batch numpy 5000 1000 | 0.073963 / 0.075469 (-0.001506) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 1.854357 / 1.841788 (0.012570) |
| map fast-tokenizer batched | 16.361484 / 8.074308 (8.287176) |
| map identity | 28.856664 / 10.191392 (18.665272) |
| map identity batched | 1.091451 / 0.680424 (0.411027) |
| map no-op batched | 0.705602 / 0.534201 (0.171401) |
| map no-op batched numpy | 0.455516 / 0.579283 (-0.123767) |
| map no-op batched pandas | 0.530006 / 0.434364 (0.095642) |
| map no-op batched pytorch | 0.315877 / 0.540337 (-0.224461) |
| map no-op batched tensorflow | 0.319592 / 1.386936 (-1.067344) |
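
Each benchmark cell above has the form `new / old (diff)`, and the numbers are consistent with `diff = new - old` up to rounding in the last digit. The following is a minimal sketch, not part of the CI tooling, of how such a row could be parsed and checked. The names `CELL_RE`, `parse_cells`, and `relative_change` are hypothetical helpers introduced here for illustration, and the comment does not state the unit of the values.

```python
import re

# Matches one benchmark cell such as "0.009507 / 0.011353 (-0.001846)".
CELL_RE = re.compile(r"(-?\d+\.\d+) / (-?\d+\.\d+) \((-?\d+\.\d+)\)")


def parse_cells(row: str) -> list[tuple[float, ...]]:
    """Return a (new, old, diff) triple for every cell found in a row of text."""
    return [tuple(float(x) for x in m.groups()) for m in CELL_RE.finditer(row)]


def relative_change(new: float, old: float) -> float:
    """Fractional change of `new` relative to `old` (assumes old != 0)."""
    return (new - old) / old


# Two cells copied from the first benchmark_array_xd.json row above.
row = "new / old (diff) 0.009507 / 0.011353 (-0.001846) 0.004440 / 0.011008 (-0.006568)"
for new, old, diff in parse_cells(row):
    # The reported diff should equal new - old up to rounding in the sixth decimal.
    assert abs((new - old) - diff) < 2e-6
    print(f"{new:.6f} vs {old:.6f}: {relative_change(new, old):+.1%}")
```

A relative change is often easier to compare across metrics than the absolute diff, since the old values in these tables span several orders of magnitude.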
