Don't add a tag on the Hub on release (#4998)
don't add a tag on the Hub on release
lhoestq committed Sep 20, 2022
1 parent 142404f commit ace149f
Showing 1 changed file with 2 additions and 7 deletions.
9 changes: 2 additions & 7 deletions .github/hub/update_hub_repositories.py
@@ -194,13 +194,8 @@ def __call__(self, dataset_name: str) -> bool:
 commit_args += (f"-m Commit from {DATASETS_LIB_COMMIT_URL.format(hexsha=current_commit.hexsha)}",)
 commit_args += (f"--author={author_name} <{author_email}>",)

-for _tag in datasets_lib_repo.tags:
-    # Add a new tag if this is a `datasets` release
-    if _tag.commit == current_commit and re.match(r"^[0-9]+\.[0-9]+\.[0-9]+$", _tag.name):
-        new_tag = _tag
-        break
-else:
-    new_tag = None
+# we don't add a new tag as we used to when there's a release
+new_tag = None

 changed_files_since_last_commit = [
     path
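
For readers unfamiliar with the removed loop: it appears to rely on Python's `for`/`else`, where the `else` branch runs only if the loop finishes without hitting `break`. The sketch below reproduces that pattern in isolation. `Tag` and `find_release_tag` are hypothetical stand-ins introduced here for illustration (the script itself iterates over `datasets_lib_repo.tags`), and the tag names and commit ids are made up.

```python
import re
from typing import NamedTuple, Optional, Sequence

# Same pattern as in the removed code: exactly MAJOR.MINOR.PATCH, nothing else.
RELEASE_TAG = re.compile(r"^[0-9]+\.[0-9]+\.[0-9]+$")


class Tag(NamedTuple):
    """Hypothetical stand-in for a git tag: a name and the commit it points to."""
    name: str
    commit: str


def find_release_tag(tags: Sequence[Tag], current_commit: str) -> Optional[Tag]:
    """Return the release tag pointing at current_commit, or None if there is none."""
    for tag in tags:
        if tag.commit == current_commit and RELEASE_TAG.match(tag.name):
            new_tag = tag
            break
    else:
        # Reached only when the loop ends without `break` (including an empty tag list).
        new_tag = None
    return new_tag


# Made-up example data, not taken from the repository.
tags = [Tag("nightly", "abc123"), Tag("2.5.0.dev0", "def456"), Tag("2.5.0", "abc123")]
release = find_release_tag(tags, "abc123")
```

After this commit the script skips the search entirely and always sets `new_tag = None`.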

1 comment on commit ace149f

@github-actions

PyArrow==6.0.0


Benchmark: benchmark_array_xd.json

metric: new / old (diff)
read_batch_formatted_as_numpy after write_array2d: 0.007588 / 0.011353 (-0.003764)
read_batch_formatted_as_numpy after write_flattened_sequence: 0.003711 / 0.011008 (-0.007297)
read_batch_formatted_as_numpy after write_nested_sequence: 0.028579 / 0.038508 (-0.009929)
read_batch_unformated after write_array2d: 0.029134 / 0.023109 (0.006024)
read_batch_unformated after write_flattened_sequence: 0.306355 / 0.275898 (0.030457)
read_batch_unformated after write_nested_sequence: 0.366351 / 0.323480 (0.042871)
read_col_formatted_as_numpy after write_array2d: 0.005519 / 0.007986 (-0.002467)
read_col_formatted_as_numpy after write_flattened_sequence: 0.003120 / 0.004328 (-0.001208)
read_col_formatted_as_numpy after write_nested_sequence: 0.006591 / 0.004250 (0.002340)
read_col_unformated after write_array2d: 0.039510 / 0.037052 (0.002458)
read_col_unformated after write_flattened_sequence: 0.317224 / 0.258489 (0.058735)
read_col_unformated after write_nested_sequence: 0.361090 / 0.293841 (0.067249)
read_formatted_as_numpy after write_array2d: 0.029065 / 0.128546 (-0.099481)
read_formatted_as_numpy after write_flattened_sequence: 0.009467 / 0.075646 (-0.066179)
read_formatted_as_numpy after write_nested_sequence: 0.248190 / 0.419271 (-0.171082)
read_unformated after write_array2d: 0.046645 / 0.043533 (0.003112)
read_unformated after write_flattened_sequence: 0.308224 / 0.255139 (0.053085)
read_unformated after write_nested_sequence: 0.337023 / 0.283200 (0.053824)
write_array2d: 0.089928 / 0.141683 (-0.051755)
write_flattened_sequence: 1.587022 / 1.452155 (0.134868)
write_nested_sequence: 1.601789 / 1.492716 (0.109073)

Benchmark: benchmark_getitem_100B.json

metric: new / old (diff)
get_batch_of_1024_random_rows: 0.203528 / 0.018006 (0.185521)
get_batch_of_1024_rows: 0.417651 / 0.000490 (0.417161)
get_first_row: 0.001686 / 0.000200 (0.001486)
get_last_row: 0.000073 / 0.000054 (0.000018)

Benchmark: benchmark_indices_mapping.json

metric: new / old (diff)
select: 0.021058 / 0.037411 (-0.016353)
shard: 0.090247 / 0.014526 (0.075721)
shuffle: 0.099371 / 0.176557 (-0.077186)
sort: 0.138280 / 0.737135 (-0.598855)
train_test_split: 0.103029 / 0.296338 (-0.193309)

Benchmark: benchmark_iterating.json

metric: new / old (diff)
read 5000: 0.408684 / 0.215209 (0.193475)
read 50000: 4.086254 / 2.077655 (2.008599)
read_batch 50000 10: 1.840724 / 1.504120 (0.336604)
read_batch 50000 100: 1.638087 / 1.541195 (0.096892)
read_batch 50000 1000: 1.658803 / 1.468490 (0.190313)
read_formatted numpy 5000: 0.438072 / 4.584777 (-4.146705)
read_formatted pandas 5000: 3.353337 / 3.745712 (-0.392375)
read_formatted tensorflow 5000: 2.859013 / 5.269862 (-2.410848)
read_formatted torch 5000: 1.482290 / 4.565676 (-3.083387)
read_formatted_batch numpy 5000 10: 0.052958 / 0.424275 (-0.371317)
read_formatted_batch numpy 5000 1000: 0.010735 / 0.007607 (0.003128)
shuffled read 5000: 0.519285 / 0.226044 (0.293241)
shuffled read 50000: 5.202973 / 2.268929 (2.934044)
shuffled read_batch 50000 10: 2.282010 / 55.444624 (-53.162614)
shuffled read_batch 50000 100: 1.937781 / 6.876477 (-4.938696)
shuffled read_batch 50000 1000: 2.003646 / 2.142072 (-0.138426)
shuffled read_formatted numpy 5000: 0.557867 / 4.805227 (-4.247360)
shuffled read_formatted_batch numpy 5000 10: 0.116320 / 6.500664 (-6.384344)
shuffled read_formatted_batch numpy 5000 1000: 0.061859 / 0.075469 (-0.013611)

Benchmark: benchmark_map_filter.json

metric: new / old (diff)
filter: 1.499539 / 1.841788 (-0.342248)
map fast-tokenizer batched: 12.339507 / 8.074308 (4.265199)
map identity: 26.209782 / 10.191392 (16.018390)
map identity batched: 0.828458 / 0.680424 (0.148034)
map no-op batched: 0.558925 / 0.534201 (0.024724)
map no-op batched numpy: 0.346887 / 0.579283 (-0.232396)
map no-op batched pandas: 0.398206 / 0.434364 (-0.036158)
map no-op batched pytorch: 0.233846 / 0.540337 (-0.306491)
map no-op batched tensorflow: 0.252411 / 1.386936 (-1.134525)
PyArrow==latest

Benchmark: benchmark_array_xd.json

metric: new / old (diff)
read_batch_formatted_as_numpy after write_array2d: 0.005540 / 0.011353 (-0.005813)
read_batch_formatted_as_numpy after write_flattened_sequence: 0.003565 / 0.011008 (-0.007443)
read_batch_formatted_as_numpy after write_nested_sequence: 0.026886 / 0.038508 (-0.011622)
read_batch_unformated after write_array2d: 0.027447 / 0.023109 (0.004338)
read_batch_unformated after write_flattened_sequence: 0.416179 / 0.275898 (0.140281)
read_batch_unformated after write_nested_sequence: 0.474001 / 0.323480 (0.150521)
read_col_formatted_as_numpy after write_array2d: 0.003462 / 0.007986 (-0.004524)
read_col_formatted_as_numpy after write_flattened_sequence: 0.004227 / 0.004328 (-0.000102)
read_col_formatted_as_numpy after write_nested_sequence: 0.004792 / 0.004250 (0.000541)
read_col_unformated after write_array2d: 0.036750 / 0.037052 (-0.000302)
read_col_unformated after write_flattened_sequence: 0.422249 / 0.258489 (0.163760)
read_col_unformated after write_nested_sequence: 0.460576 / 0.293841 (0.166735)
read_formatted_as_numpy after write_array2d: 0.027260 / 0.128546 (-0.101286)
read_formatted_as_numpy after write_flattened_sequence: 0.009396 / 0.075646 (-0.066250)
read_formatted_as_numpy after write_nested_sequence: 0.248748 / 0.419271 (-0.170524)
read_unformated after write_array2d: 0.048764 / 0.043533 (0.005231)
read_unformated after write_flattened_sequence: 0.420943 / 0.255139 (0.165804)
read_unformated after write_nested_sequence: 0.445714 / 0.283200 (0.162514)
write_array2d: 0.089205 / 0.141683 (-0.052478)
write_flattened_sequence: 1.509852 / 1.452155 (0.057698)
write_nested_sequence: 1.534527 / 1.492716 (0.041811)

Benchmark: benchmark_getitem_100B.json

metric: new / old (diff)
get_batch_of_1024_random_rows: 0.255551 / 0.018006 (0.237545)
get_batch_of_1024_rows: 0.405696 / 0.000490 (0.405206)
get_first_row: 0.023164 / 0.000200 (0.022964)
get_last_row: 0.000219 / 0.000054 (0.000164)

Benchmark: benchmark_indices_mapping.json

metric: new / old (diff)
select: 0.020105 / 0.037411 (-0.017306)
shard: 0.093269 / 0.014526 (0.078743)
shuffle: 0.100789 / 0.176557 (-0.075768)
sort: 0.140003 / 0.737135 (-0.597132)
train_test_split: 0.104446 / 0.296338 (-0.191892)

Benchmark: benchmark_iterating.json

metric: new / old (diff)
read 5000: 0.431197 / 0.215209 (0.215988)
read 50000: 4.297359 / 2.077655 (2.219705)
read_batch 50000 10: 2.015842 / 1.504120 (0.511723)
read_batch 50000 100: 1.834814 / 1.541195 (0.293620)
read_batch 50000 1000: 1.849678 / 1.468490 (0.381188)
read_formatted numpy 5000: 0.446341 / 4.584777 (-4.138436)
read_formatted pandas 5000: 3.338914 / 3.745712 (-0.406798)
read_formatted tensorflow 5000: 1.802312 / 5.269862 (-3.467550)
read_formatted torch 5000: 1.091425 / 4.565676 (-3.474252)
read_formatted_batch numpy 5000 10: 0.053212 / 0.424275 (-0.371063)
read_formatted_batch numpy 5000 1000: 0.010932 / 0.007607 (0.003325)
shuffled read 5000: 0.543494 / 0.226044 (0.317449)
shuffled read 50000: 5.434947 / 2.268929 (3.166019)
shuffled read_batch 50000 10: 2.471340 / 55.444624 (-52.973284)
shuffled read_batch 50000 100: 2.140305 / 6.876477 (-4.736172)
shuffled read_batch 50000 1000: 2.210892 / 2.142072 (0.068820)
shuffled read_formatted numpy 5000: 0.558632 / 4.805227 (-4.246595)
shuffled read_formatted_batch numpy 5000 10: 0.117740 / 6.500664 (-6.382924)
shuffled read_formatted_batch numpy 5000 1000: 0.062745 / 0.075469 (-0.012724)

Benchmark: benchmark_map_filter.json

metric: new / old (diff)
filter: 1.560853 / 1.841788 (-0.280935)
map fast-tokenizer batched: 12.378969 / 8.074308 (4.304661)
map identity: 26.163603 / 10.191392 (15.972211)
map identity batched: 0.929253 / 0.680424 (0.248829)
map no-op batched: 0.638740 / 0.534201 (0.104539)
map no-op batched numpy: 0.349062 / 0.579283 (-0.230221)
map no-op batched pandas: 0.418454 / 0.434364 (-0.015910)
map no-op batched pytorch: 0.239639 / 0.540337 (-0.300698)
map no-op batched tensorflow: 0.244230 / 1.386936 (-1.142706)
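
Every value in the benchmark tables above has the shape `new / old (diff)`. To compare runs programmatically, a minimal parsing sketch could look like the following. `parse_cell` and `Cell` are hypothetical names introduced here and are not part of the benchmark tooling. The printed diff can differ from `new - old` in the last digit, presumably because it was computed before rounding (an assumption), so comparisons should allow a small tolerance.

```python
import re
from typing import NamedTuple

# Matches e.g. "0.007588 / 0.011353 (-0.003764)".
CELL = re.compile(r"^\s*(-?\d+\.\d+)\s*/\s*(-?\d+\.\d+)\s*\((-?\d+\.\d+)\)\s*$")


class Cell(NamedTuple):
    """One benchmark value: new measurement, old measurement, reported difference."""
    new: float
    old: float
    diff: float


def parse_cell(text: str) -> Cell:
    """Parse a 'new / old (diff)' string into floats; raise ValueError on bad input."""
    match = CELL.match(text)
    if match is None:
        raise ValueError(f"not a 'new / old (diff)' cell: {text!r}")
    new, old, diff = (float(group) for group in match.groups())
    return Cell(new, old, diff)


# First value of benchmark_array_xd.json in the PyArrow==6.0.0 run.
cell = parse_cell("0.007588 / 0.011353 (-0.003764)")
# A negative diff means the new value is lower than the old one.
is_lower = cell.diff < 0
```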
