Use lower torchaudio version
mariosasko committed Jul 26, 2022
1 parent c3bc52d commit ad949dd
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion setup.py
@@ -125,7 +125,7 @@
 "s3fs>=2021.11.1",  # aligned with fsspec[http]>=2021.11.1
 "tensorflow>=2.3,!=2.6.0,!=2.6.1",
 "torch",
-"torchaudio",
+"torchaudio<0.12.0",
 "soundfile",
 "transformers",
 # datasets dependencies

1 comment on commit ad949dd

@github-actions
Show benchmarks

PyArrow==6.0.0

Show updated benchmarks!

Benchmark: benchmark_array_xd.json

metric | new / old (diff)
--- | ---
read_batch_formatted_as_numpy after write_array2d | 0.008356 / 0.011353 (-0.002997)
read_batch_formatted_as_numpy after write_flattened_sequence | 0.004147 / 0.011008 (-0.006861)
read_batch_formatted_as_numpy after write_nested_sequence | 0.031758 / 0.038508 (-0.006750)
read_batch_unformated after write_array2d | 0.035791 / 0.023109 (0.012682)
read_batch_unformated after write_flattened_sequence | 0.316014 / 0.275898 (0.040116)
read_batch_unformated after write_nested_sequence | 0.379435 / 0.323480 (0.055955)
read_col_formatted_as_numpy after write_array2d | 0.006265 / 0.007986 (-0.001721)
read_col_formatted_as_numpy after write_flattened_sequence | 0.003672 / 0.004328 (-0.000656)
read_col_formatted_as_numpy after write_nested_sequence | 0.007140 / 0.004250 (0.002890)
read_col_unformated after write_array2d | 0.054464 / 0.037052 (0.017412)
read_col_unformated after write_flattened_sequence | 0.322564 / 0.258489 (0.064075)
read_col_unformated after write_nested_sequence | 0.359588 / 0.293841 (0.065747)
read_formatted_as_numpy after write_array2d | 0.031219 / 0.128546 (-0.097327)
read_formatted_as_numpy after write_flattened_sequence | 0.009740 / 0.075646 (-0.065906)
read_formatted_as_numpy after write_nested_sequence | 0.267174 / 0.419271 (-0.152098)
read_unformated after write_array2d | 0.052233 / 0.043533 (0.008700)
read_unformated after write_flattened_sequence | 0.302727 / 0.255139 (0.047588)
read_unformated after write_nested_sequence | 0.323189 / 0.283200 (0.039990)
write_array2d | 0.104270 / 0.141683 (-0.037413)
write_flattened_sequence | 1.511826 / 1.452155 (0.059671)
write_nested_sequence | 1.554930 / 1.492716 (0.062214)

Benchmark: benchmark_getitem_100B.json

metric | new / old (diff)
--- | ---
get_batch_of_1024_random_rows | 0.292580 / 0.018006 (0.274574)
get_batch_of_1024_rows | 0.539231 / 0.000490 (0.538742)
get_first_row | 0.003105 / 0.000200 (0.002905)
get_last_row | 0.000095 / 0.000054 (0.000040)

Benchmark: benchmark_indices_mapping.json

metric | new / old (diff)
--- | ---
select | 0.026022 / 0.037411 (-0.011390)
shard | 0.107410 / 0.014526 (0.092884)
shuffle | 0.121489 / 0.176557 (-0.055068)
sort | 0.182674 / 0.737135 (-0.554461)
train_test_split | 0.123812 / 0.296338 (-0.172527)

Benchmark: benchmark_iterating.json

metric | new / old (diff)
--- | ---
read 5000 | 0.408190 / 0.215209 (0.192981)
read 50000 | 4.059304 / 2.077655 (1.981649)
read_batch 50000 10 | 1.861516 / 1.504120 (0.357396)
read_batch 50000 100 | 1.681103 / 1.541195 (0.139908)
read_batch 50000 1000 | 1.796872 / 1.468490 (0.328382)
read_formatted numpy 5000 | 0.419524 / 4.584777 (-4.165252)
read_formatted pandas 5000 | 3.808383 / 3.745712 (0.062671)
read_formatted tensorflow 5000 | 3.781788 / 5.269862 (-1.488073)
read_formatted torch 5000 | 1.791915 / 4.565676 (-2.773762)
read_formatted_batch numpy 5000 10 | 0.050819 / 0.424275 (-0.373456)
read_formatted_batch numpy 5000 1000 | 0.011611 / 0.007607 (0.004004)
shuffled read 5000 | 0.513705 / 0.226044 (0.287661)
shuffled read 50000 | 5.165057 / 2.268929 (2.896129)
shuffled read_batch 50000 10 | 2.310719 / 55.444624 (-53.133906)
shuffled read_batch 50000 100 | 1.991031 / 6.876477 (-4.885446)
shuffled read_batch 50000 1000 | 2.229845 / 2.142072 (0.087772)
shuffled read_formatted numpy 5000 | 0.537654 / 4.805227 (-4.267573)
shuffled read_formatted_batch numpy 5000 10 | 0.116629 / 6.500664 (-6.384035)
shuffled read_formatted_batch numpy 5000 1000 | 0.061130 / 0.075469 (-0.014339)

Benchmark: benchmark_map_filter.json

metric | new / old (diff)
--- | ---
filter | 1.494398 / 1.841788 (-0.347389)
map fast-tokenizer batched | 14.899301 / 8.074308 (6.824993)
map identity | 25.241510 / 10.191392 (15.050118)
map identity batched | 0.847714 / 0.680424 (0.167291)
map no-op batched | 0.525940 / 0.534201 (-0.008261)
map no-op batched numpy | 0.387079 / 0.579283 (-0.192204)
map no-op batched pandas | 0.434812 / 0.434364 (0.000448)
map no-op batched pytorch | 0.267193 / 0.540337 (-0.273144)
map no-op batched tensorflow | 0.273550 / 1.386936 (-1.113386)
PyArrow==latest

Show updated benchmarks!

Benchmark: benchmark_array_xd.json

metric | new / old (diff)
--- | ---
read_batch_formatted_as_numpy after write_array2d | 0.006318 / 0.011353 (-0.005035)
read_batch_formatted_as_numpy after write_flattened_sequence | 0.004120 / 0.011008 (-0.006888)
read_batch_formatted_as_numpy after write_nested_sequence | 0.027878 / 0.038508 (-0.010630)
read_batch_unformated after write_array2d | 0.034743 / 0.023109 (0.011634)
read_batch_unformated after write_flattened_sequence | 0.345373 / 0.275898 (0.069475)
read_batch_unformated after write_nested_sequence | 0.417873 / 0.323480 (0.094393)
read_col_formatted_as_numpy after write_array2d | 0.004399 / 0.007986 (-0.003587)
read_col_formatted_as_numpy after write_flattened_sequence | 0.003677 / 0.004328 (-0.000652)
read_col_formatted_as_numpy after write_nested_sequence | 0.005032 / 0.004250 (0.000782)
read_col_unformated after write_array2d | 0.052084 / 0.037052 (0.015032)
read_col_unformated after write_flattened_sequence | 0.350900 / 0.258489 (0.092411)
read_col_unformated after write_nested_sequence | 0.388321 / 0.293841 (0.094481)
read_formatted_as_numpy after write_array2d | 0.029816 / 0.128546 (-0.098730)
read_formatted_as_numpy after write_flattened_sequence | 0.009900 / 0.075646 (-0.065746)
read_formatted_as_numpy after write_nested_sequence | 0.257044 / 0.419271 (-0.162227)
read_unformated after write_array2d | 0.054702 / 0.043533 (0.011169)
read_unformated after write_flattened_sequence | 0.342097 / 0.255139 (0.086958)
read_unformated after write_nested_sequence | 0.373624 / 0.283200 (0.090424)
write_array2d | 0.111470 / 0.141683 (-0.030213)
write_flattened_sequence | 1.530767 / 1.452155 (0.078612)
write_nested_sequence | 1.572804 / 1.492716 (0.080088)

Benchmark: benchmark_getitem_100B.json

metric | new / old (diff)
--- | ---
get_batch_of_1024_random_rows | 0.377803 / 0.018006 (0.359797)
get_batch_of_1024_rows | 0.545037 / 0.000490 (0.544547)
get_first_row | 0.024563 / 0.000200 (0.024363)
get_last_row | 0.000142 / 0.000054 (0.000087)

Benchmark: benchmark_indices_mapping.json

metric | new / old (diff)
--- | ---
select | 0.026000 / 0.037411 (-0.011412)
shard | 0.106446 / 0.014526 (0.091920)
shuffle | 0.121670 / 0.176557 (-0.054887)
sort | 0.174834 / 0.737135 (-0.562301)
train_test_split | 0.123957 / 0.296338 (-0.172381)

Benchmark: benchmark_iterating.json

metric | new / old (diff)
--- | ---
read 5000 | 0.420497 / 0.215209 (0.205288)
read 50000 | 4.170654 / 2.077655 (2.093000)
read_batch 50000 10 | 2.017578 / 1.504120 (0.513458)
read_batch 50000 100 | 1.835794 / 1.541195 (0.294600)
read_batch 50000 1000 | 1.970495 / 1.468490 (0.502005)
read_formatted numpy 5000 | 0.430322 / 4.584777 (-4.154455)
read_formatted pandas 5000 | 3.791178 / 3.745712 (0.045466)
read_formatted tensorflow 5000 | 2.042226 / 5.269862 (-3.227636)
read_formatted torch 5000 | 1.235142 / 4.565676 (-3.330534)
read_formatted_batch numpy 5000 10 | 0.051087 / 0.424275 (-0.373188)
read_formatted_batch numpy 5000 1000 | 0.011035 / 0.007607 (0.003427)
shuffled read 5000 | 0.521006 / 0.226044 (0.294961)
shuffled read 50000 | 5.223455 / 2.268929 (2.954526)
shuffled read_batch 50000 10 | 2.464132 / 55.444624 (-52.980493)
shuffled read_batch 50000 100 | 2.132568 / 6.876477 (-4.743909)
shuffled read_batch 50000 1000 | 2.341698 / 2.142072 (0.199625)
shuffled read_formatted numpy 5000 | 0.534612 / 4.805227 (-4.270615)
shuffled read_formatted_batch numpy 5000 10 | 0.121395 / 6.500664 (-6.379269)
shuffled read_formatted_batch numpy 5000 1000 | 0.063381 / 0.075469 (-0.012088)

Benchmark: benchmark_map_filter.json

metric | new / old (diff)
--- | ---
filter | 1.479622 / 1.841788 (-0.362166)
map fast-tokenizer batched | 14.465765 / 8.074308 (6.391457)
map identity | 25.102668 / 10.191392 (14.911275)
map identity batched | 0.892132 / 0.680424 (0.211708)
map no-op batched | 0.557088 / 0.534201 (0.022887)
map no-op batched numpy | 0.389737 / 0.579283 (-0.189546)
map no-op batched pandas | 0.431138 / 0.434364 (-0.003226)
map no-op batched pytorch | 0.267137 / 0.540337 (-0.273200)
map no-op batched tensorflow | 0.273966 / 1.386936 (-1.112970)
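
Reading the cells: each entry above has the form `new / old (diff)`, and the numbers are consistent with `diff = new - old` up to rounding in the last printed digit. The following is a minimal sketch for parsing such a row and checking that relationship. The names `parse_row` and `check_row` and the tolerance are illustrative assumptions, not part of the benchmark tooling.

```python
import re

# Matches one benchmark cell such as "0.008356 / 0.011353 (-0.002997)".
CELL = re.compile(r"(\d+\.\d+) / (\d+\.\d+) \((-?\d+\.\d+)\)")


def parse_row(row):
    """Return a (new, old, diff) tuple of floats for every cell in a row."""
    return [tuple(float(x) for x in match) for match in CELL.findall(row)]


def check_row(row, tol=2e-6):
    """True if every reported diff equals new - old within rounding error.

    The tolerance is an assumption: values are printed with six decimals,
    so the printed diff can be off by about one unit in the last place.
    """
    return all(abs((new - old) - diff) <= tol for new, old, diff in parse_row(row))


# Two cells copied from the benchmark_map_filter results (PyArrow==6.0.0).
row = "new / old (diff) 1.494398 / 1.841788 (-0.347389) 14.899301 / 8.074308 (6.824993)"
cells = parse_row(row)
print(cells)
print(check_row(row))
```

A negative diff means the new timing is lower than the old one for that metric.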
