patch CI_HUB_TOKEN_PATH with Path instead of str (#5026)
* patch CI_HUB_TOKEN_PATH with Path instead of str

* flake8
Wauplin authored and lhoestq committed Oct 5, 2022
1 parent 4a81477 commit 60dcc68
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions tests/fixtures/hub.py
@@ -1,6 +1,6 @@
-import os.path
 import time
 from contextlib import contextmanager
+from pathlib import Path
 from unittest.mock import patch

 import pytest
@@ -16,7 +16,7 @@

 CI_HUB_ENDPOINT = "https://hub-ci.huggingface.co"
 CI_HUB_DATASETS_URL = CI_HUB_ENDPOINT + "/datasets/{repo_id}/resolve/{revision}/{path}"
-CI_HUB_TOKEN_PATH = os.path.expanduser("~/.huggingface/hub_ci_token")
+CI_HUB_TOKEN_PATH = Path("~/.huggingface/hub_ci_token").expanduser()


 @pytest.fixture
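
The rest of the file is collapsed, so the attribute that CI_HUB_TOKEN_PATH is patched onto is not visible here. As a rough illustration of why the type of the patched value can matter, the sketch below uses a hypothetical TokenStore class (not part of this repository or of any library) whose code calls Path methods on its token path. Patching it with a Path works; patching it with a str fails with AttributeError.

```python
import tempfile
from pathlib import Path
from unittest.mock import patch


class TokenStore:
    """Hypothetical stand-in for code that keeps its token location as a Path."""

    path_token = Path("~/.huggingface/token").expanduser()

    @classmethod
    def get_token(cls):
        # Uses Path methods, so the attribute must be a Path, not a str.
        if cls.path_token.exists():
            return cls.path_token.read_text()
        return None


with tempfile.TemporaryDirectory() as tmp:
    token_file = Path(tmp) / "hub_ci_token"
    token_file.write_text("dummy-token")

    # Patching with a Path keeps the attribute's expected type.
    with patch.object(TokenStore, "path_token", token_file):
        patched_with_path = TokenStore.get_token()

    # Patching with a str breaks callers that rely on Path methods.
    try:
        with patch.object(TokenStore, "path_token", str(token_file)):
            TokenStore.get_token()
        str_patch_failed = False
    except AttributeError:
        str_patch_failed = True

print(patched_with_path, str_patch_failed)
```

Keeping the patched value the same type as the original attribute means the test exercises the same code path as production code would.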

1 comment on commit 60dcc68

@github-actions

PyArrow==6.0.0


Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.007671 / 0.011353 (-0.003682) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.003528 / 0.011008 (-0.007481) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.028353 / 0.038508 (-0.010155) |
| read_batch_unformated after write_array2d | 0.029932 / 0.023109 (0.006823) |
| read_batch_unformated after write_flattened_sequence | 0.302538 / 0.275898 (0.026639) |
| read_batch_unformated after write_nested_sequence | 0.363634 / 0.323480 (0.040154) |
| read_col_formatted_as_numpy after write_array2d | 0.005405 / 0.007986 (-0.002580) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.002899 / 0.004328 (-0.001429) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.006598 / 0.004250 (0.002347) |
| read_col_unformated after write_array2d | 0.040459 / 0.037052 (0.003407) |
| read_col_unformated after write_flattened_sequence | 0.318691 / 0.258489 (0.060202) |
| read_col_unformated after write_nested_sequence | 0.353902 / 0.293841 (0.060061) |
| read_formatted_as_numpy after write_array2d | 0.028801 / 0.128546 (-0.099745) |
| read_formatted_as_numpy after write_flattened_sequence | 0.009283 / 0.075646 (-0.066364) |
| read_formatted_as_numpy after write_nested_sequence | 0.247163 / 0.419271 (-0.172108) |
| read_unformated after write_array2d | 0.044229 / 0.043533 (0.000696) |
| read_unformated after write_flattened_sequence | 0.312606 / 0.255139 (0.057467) |
| read_unformated after write_nested_sequence | 0.334286 / 0.283200 (0.051086) |
| write_array2d | 0.090671 / 0.141683 (-0.051012) |
| write_flattened_sequence | 1.455004 / 1.452155 (0.002849) |
| write_nested_sequence | 1.497702 / 1.492716 (0.004986) |

Benchmark: benchmark_getitem_100B.json

| metric | new / old (diff) |
|---|---|
| get_batch_of_1024_random_rows | 0.213717 / 0.018006 (0.195710) |
| get_batch_of_1024_rows | 0.437770 / 0.000490 (0.437280) |
| get_first_row | 0.000887 / 0.000200 (0.000687) |
| get_last_row | 0.000072 / 0.000054 (0.000017) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.020847 / 0.037411 (-0.016564) |
| shard | 0.096403 / 0.014526 (0.081877) |
| shuffle | 0.102697 / 0.176557 (-0.073859) |
| sort | 0.143859 / 0.737135 (-0.593276) |
| train_test_split | 0.107501 / 0.296338 (-0.188838) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.407972 / 0.215209 (0.192763) |
| read 50000 | 4.082106 / 2.077655 (2.004451) |
| read_batch 50000 10 | 1.850279 / 1.504120 (0.346159) |
| read_batch 50000 100 | 1.651276 / 1.541195 (0.110081) |
| read_batch 50000 1000 | 1.717091 / 1.468490 (0.248601) |
| read_formatted numpy 5000 | 0.438458 / 4.584777 (-4.146319) |
| read_formatted pandas 5000 | 3.339719 / 3.745712 (-0.405993) |
| read_formatted tensorflow 5000 | 1.863584 / 5.269862 (-3.406278) |
| read_formatted torch 5000 | 1.242080 / 4.565676 (-3.323596) |
| read_formatted_batch numpy 5000 10 | 0.052711 / 0.424275 (-0.371565) |
| read_formatted_batch numpy 5000 1000 | 0.011065 / 0.007607 (0.003458) |
| shuffled read 5000 | 0.520770 / 0.226044 (0.294726) |
| shuffled read 50000 | 5.231280 / 2.268929 (2.962351) |
| shuffled read_batch 50000 10 | 2.282165 / 55.444624 (-53.162459) |
| shuffled read_batch 50000 100 | 1.929408 / 6.876477 (-4.947068) |
| shuffled read_batch 50000 1000 | 2.024327 / 2.142072 (-0.117745) |
| shuffled read_formatted numpy 5000 | 0.556977 / 4.805227 (-4.248250) |
| shuffled read_formatted_batch numpy 5000 10 | 0.116831 / 6.500664 (-6.383833) |
| shuffled read_formatted_batch numpy 5000 1000 | 0.063193 / 0.075469 (-0.012276) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 1.538069 / 1.841788 (-0.303719) |
| map fast-tokenizer batched | 13.464903 / 8.074308 (5.390595) |
| map identity | 26.378125 / 10.191392 (16.186733) |
| map identity batched | 0.859324 / 0.680424 (0.178900) |
| map no-op batched | 0.575637 / 0.534201 (0.041436) |
| map no-op batched numpy | 0.346809 / 0.579283 (-0.232474) |
| map no-op batched pandas | 0.410702 / 0.434364 (-0.023662) |
| map no-op batched pytorch | 0.236744 / 0.540337 (-0.303593) |
| map no-op batched tensorflow | 0.247791 / 1.386936 (-1.139145) |

PyArrow==latest

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.005784 / 0.011353 (-0.005568) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.003819 / 0.011008 (-0.007189) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.026956 / 0.038508 (-0.011552) |
| read_batch_unformated after write_array2d | 0.028701 / 0.023109 (0.005592) |
| read_batch_unformated after write_flattened_sequence | 0.337806 / 0.275898 (0.061907) |
| read_batch_unformated after write_nested_sequence | 0.427359 / 0.323480 (0.103879) |
| read_col_formatted_as_numpy after write_array2d | 0.003492 / 0.007986 (-0.004493) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.004273 / 0.004328 (-0.000056) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.004685 / 0.004250 (0.000434) |
| read_col_unformated after write_array2d | 0.034234 / 0.037052 (-0.002819) |
| read_col_unformated after write_flattened_sequence | 0.342468 / 0.258489 (0.083979) |
| read_col_unformated after write_nested_sequence | 0.398077 / 0.293841 (0.104236) |
| read_formatted_as_numpy after write_array2d | 0.027451 / 0.128546 (-0.101095) |
| read_formatted_as_numpy after write_flattened_sequence | 0.009595 / 0.075646 (-0.066051) |
| read_formatted_as_numpy after write_nested_sequence | 0.250010 / 0.419271 (-0.169262) |
| read_unformated after write_array2d | 0.046661 / 0.043533 (0.003128) |
| read_unformated after write_flattened_sequence | 0.340582 / 0.255139 (0.085443) |
| read_unformated after write_nested_sequence | 0.375432 / 0.283200 (0.092233) |
| write_array2d | 0.091426 / 0.141683 (-0.050256) |
| write_flattened_sequence | 1.602354 / 1.452155 (0.150200) |
| write_nested_sequence | 1.590339 / 1.492716 (0.097623) |

Benchmark: benchmark_getitem_100B.json

| metric | new / old (diff) |
|---|---|
| get_batch_of_1024_random_rows | 0.222898 / 0.018006 (0.204891) |
| get_batch_of_1024_rows | 0.400805 / 0.000490 (0.400316) |
| get_first_row | 0.003326 / 0.000200 (0.003126) |
| get_last_row | 0.000079 / 0.000054 (0.000024) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.021461 / 0.037411 (-0.015950) |
| shard | 0.096132 / 0.014526 (0.081606) |
| shuffle | 0.105232 / 0.176557 (-0.071325) |
| sort | 0.151259 / 0.737135 (-0.585876) |
| train_test_split | 0.105280 / 0.296338 (-0.191058) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.468107 / 0.215209 (0.252898) |
| read 50000 | 4.696040 / 2.077655 (2.618385) |
| read_batch 50000 10 | 2.410121 / 1.504120 (0.906001) |
| read_batch 50000 100 | 2.222717 / 1.541195 (0.681522) |
| read_batch 50000 1000 | 2.295524 / 1.468490 (0.827034) |
| read_formatted numpy 5000 | 0.447037 / 4.584777 (-4.137740) |
| read_formatted pandas 5000 | 3.477434 / 3.745712 (-0.268278) |
| read_formatted tensorflow 5000 | 1.876801 / 5.269862 (-3.393061) |
| read_formatted torch 5000 | 1.113717 / 4.565676 (-3.451960) |
| read_formatted_batch numpy 5000 10 | 0.053066 / 0.424275 (-0.371209) |
| read_formatted_batch numpy 5000 1000 | 0.010838 / 0.007607 (0.003231) |
| shuffled read 5000 | 0.576178 / 0.226044 (0.350134) |
| shuffled read 50000 | 5.765954 / 2.268929 (3.497026) |
| shuffled read_batch 50000 10 | 2.881267 / 55.444624 (-52.563358) |
| shuffled read_batch 50000 100 | 2.546368 / 6.876477 (-4.330108) |
| shuffled read_batch 50000 1000 | 2.649131 / 2.142072 (0.507059) |
| shuffled read_formatted numpy 5000 | 0.559157 / 4.805227 (-4.246070) |
| shuffled read_formatted_batch numpy 5000 10 | 0.119128 / 6.500664 (-6.381536) |
| shuffled read_formatted_batch numpy 5000 1000 | 0.063841 / 0.075469 (-0.011628) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 1.593470 / 1.841788 (-0.248318) |
| map fast-tokenizer batched | 13.060288 / 8.074308 (4.985980) |
| map identity | 26.498694 / 10.191392 (16.307302) |
| map identity batched | 0.905861 / 0.680424 (0.225437) |
| map no-op batched | 0.645381 / 0.534201 (0.111180) |
| map no-op batched numpy | 0.345420 / 0.579283 (-0.233863) |
| map no-op batched pandas | 0.396081 / 0.434364 (-0.038282) |
| map no-op batched pytorch | 0.230496 / 0.540337 (-0.309842) |
| map no-op batched tensorflow | 0.242629 / 1.386936 (-1.144307) |
