Commit
patch CI_HUB_TOKEN_PATH with Path instead of str
Wauplin committed Sep 26, 2022
1 parent 5eefb56 commit 2442b4c
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion tests/fixtures/hub.py
@@ -1,6 +1,7 @@
 import os.path
 import time
 from contextlib import contextmanager
+from pathlib import Path
 from unittest.mock import patch

 import pytest
@@ -16,7 +17,7 @@

 CI_HUB_ENDPOINT = "https://hub-ci.huggingface.co"
 CI_HUB_DATASETS_URL = CI_HUB_ENDPOINT + "/datasets/{repo_id}/resolve/{revision}/{path}"
-CI_HUB_TOKEN_PATH = os.path.expanduser("~/.huggingface/hub_ci_token")
+CI_HUB_TOKEN_PATH = Path("~/.huggingface/hub_ci_token").expanduser()


 @pytest.fixture
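
The commit title says the constant is used to patch a token path, and that the patched value should be a `Path` rather than a `str`. The sketch below illustrates why the type can matter. It does not reproduce the fixture itself: `fake_library`, its `TOKEN_PATH` attribute, and `token_file_name` are hypothetical stand-ins, since the patched target is not visible in this diff.

```python
import os.path
from pathlib import Path
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-in for a module that stores its token path as a Path.
fake_library = SimpleNamespace(TOKEN_PATH=Path("~/.huggingface/token").expanduser())


def token_file_name():
    # Hypothetical consumer that relies on the pathlib API (.name).
    return fake_library.TOKEN_PATH.name


str_path = os.path.expanduser("~/.huggingface/hub_ci_token")  # value before the commit
path_obj = Path("~/.huggingface/hub_ci_token").expanduser()   # value after the commit

# Patching with a Path keeps pathlib-based consumers working.
with patch.object(fake_library, "TOKEN_PATH", path_obj):
    name_with_path = token_file_name()

# Patching with a str breaks them, because str has no .name attribute.
with patch.object(fake_library, "TOKEN_PATH", str_path):
    try:
        token_file_name()
        str_failed = False
    except AttributeError:
        str_failed = True
```

Under these assumptions, any consumer that calls `Path` methods on the patched attribute would fail with the old `str` value, which is presumably the situation the commit addresses.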

1 comment on commit 2442b4c

@github-actions

Show benchmarks

PyArrow==6.0.0

Show updated benchmarks!

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.010700 / 0.011353 (-0.000653) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.005150 / 0.011008 (-0.005858) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.040398 / 0.038508 (0.001890) |
| read_batch_unformated after write_array2d | 0.038529 / 0.023109 (0.015420) |
| read_batch_unformated after write_flattened_sequence | 0.379029 / 0.275898 (0.103131) |
| read_batch_unformated after write_nested_sequence | 0.462633 / 0.323480 (0.139153) |
| read_col_formatted_as_numpy after write_array2d | 0.008029 / 0.007986 (0.000043) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.004420 / 0.004328 (0.000091) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.009074 / 0.004250 (0.004824) |
| read_col_unformated after write_array2d | 0.055493 / 0.037052 (0.018441) |
| read_col_unformated after write_flattened_sequence | 0.431311 / 0.258489 (0.172821) |
| read_col_unformated after write_nested_sequence | 0.468641 / 0.293841 (0.174800) |
| read_formatted_as_numpy after write_array2d | 0.049645 / 0.128546 (-0.078902) |
| read_formatted_as_numpy after write_flattened_sequence | 0.014997 / 0.075646 (-0.060650) |
| read_formatted_as_numpy after write_nested_sequence | 0.332468 / 0.419271 (-0.086803) |
| read_unformated after write_array2d | 0.070196 / 0.043533 (0.026663) |
| read_unformated after write_flattened_sequence | 0.423240 / 0.255139 (0.168101) |
| read_unformated after write_nested_sequence | 0.421520 / 0.283200 (0.138321) |
| write_array2d | 0.120000 / 0.141683 (-0.021683) |
| write_flattened_sequence | 1.713518 / 1.452155 (0.261363) |
| write_nested_sequence | 1.850276 / 1.492716 (0.357560) |

Benchmark: benchmark_getitem_100B.json

| metric | new / old (diff) |
|---|---|
| get_batch_of_1024_random_rows | 0.270635 / 0.018006 (0.252629) |
| get_batch_of_1024_rows | 0.579853 / 0.000490 (0.579364) |
| get_first_row | 0.006617 / 0.000200 (0.006417) |
| get_last_row | 0.000107 / 0.000054 (0.000053) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.026284 / 0.037411 (-0.011127) |
| shard | 0.117015 / 0.014526 (0.102490) |
| shuffle | 0.132252 / 0.176557 (-0.044305) |
| sort | 0.182028 / 0.737135 (-0.555107) |
| train_test_split | 0.131301 / 0.296338 (-0.165038) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.608079 / 0.215209 (0.392870) |
| read 50000 | 5.945425 / 2.077655 (3.867770) |
| read_batch 50000 10 | 2.407977 / 1.504120 (0.903857) |
| read_batch 50000 100 | 2.038149 / 1.541195 (0.496955) |
| read_batch 50000 1000 | 2.096561 / 1.468490 (0.628071) |
| read_formatted numpy 5000 | 0.689237 / 4.584777 (-3.895540) |
| read_formatted pandas 5000 | 5.475804 / 3.745712 (1.730092) |
| read_formatted tensorflow 5000 | 5.084179 / 5.269862 (-0.185683) |
| read_formatted torch 5000 | 2.623874 / 4.565676 (-1.941802) |
| read_formatted_batch numpy 5000 10 | 0.085146 / 0.424275 (-0.339130) |
| read_formatted_batch numpy 5000 1000 | 0.014140 / 0.007607 (0.006533) |
| shuffled read 5000 | 0.770871 / 0.226044 (0.544827) |
| shuffled read 50000 | 7.713949 / 2.268929 (5.445021) |
| shuffled read_batch 50000 10 | 3.070138 / 55.444624 (-52.374486) |
| shuffled read_batch 50000 100 | 2.387559 / 6.876477 (-4.488917) |
| shuffled read_batch 50000 1000 | 2.665075 / 2.142072 (0.523003) |
| shuffled read_formatted numpy 5000 | 0.899312 / 4.805227 (-3.905916) |
| shuffled read_formatted_batch numpy 5000 10 | 0.174409 / 6.500664 (-6.326255) |
| shuffled read_formatted_batch numpy 5000 1000 | 0.068448 / 0.075469 (-0.007021) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 1.875201 / 1.841788 (0.033413) |
| map fast-tokenizer batched | 16.353759 / 8.074308 (8.279451) |
| map identity | 39.260890 / 10.191392 (29.069498) |
| map identity batched | 1.137120 / 0.680424 (0.456697) |
| map no-op batched | 0.745878 / 0.534201 (0.211677) |
| map no-op batched numpy | 0.449662 / 0.579283 (-0.129621) |
| map no-op batched pandas | 0.618204 / 0.434364 (0.183840) |
| map no-op batched pytorch | 0.343210 / 0.540337 (-0.197128) |
| map no-op batched tensorflow | 0.344317 / 1.386936 (-1.042619) |

PyArrow==latest

Show updated benchmarks!

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.009371 / 0.011353 (-0.001982) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.005552 / 0.011008 (-0.005456) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.031810 / 0.038508 (-0.006698) |
| read_batch_unformated after write_array2d | 0.035880 / 0.023109 (0.012770) |
| read_batch_unformated after write_flattened_sequence | 0.454467 / 0.275898 (0.178569) |
| read_batch_unformated after write_nested_sequence | 0.525025 / 0.323480 (0.201546) |
| read_col_formatted_as_numpy after write_array2d | 0.004436 / 0.007986 (-0.003549) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.004980 / 0.004328 (0.000651) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.007272 / 0.004250 (0.003021) |
| read_col_unformated after write_array2d | 0.046439 / 0.037052 (0.009386) |
| read_col_unformated after write_flattened_sequence | 0.456909 / 0.258489 (0.198420) |
| read_col_unformated after write_nested_sequence | 0.503793 / 0.293841 (0.209952) |
| read_formatted_as_numpy after write_array2d | 0.046693 / 0.128546 (-0.081853) |
| read_formatted_as_numpy after write_flattened_sequence | 0.017033 / 0.075646 (-0.058614) |
| read_formatted_as_numpy after write_nested_sequence | 0.323983 / 0.419271 (-0.095288) |
| read_unformated after write_array2d | 0.096009 / 0.043533 (0.052476) |
| read_unformated after write_flattened_sequence | 0.458105 / 0.255139 (0.202966) |
| read_unformated after write_nested_sequence | 0.477930 / 0.283200 (0.194730) |
| write_array2d | 0.131909 / 0.141683 (-0.009774) |
| write_flattened_sequence | 1.692329 / 1.452155 (0.240174) |
| write_nested_sequence | 1.724066 / 1.492716 (0.231350) |

Benchmark: benchmark_getitem_100B.json

| metric | new / old (diff) |
|---|---|
| get_batch_of_1024_random_rows | 0.301374 / 0.018006 (0.283367) |
| get_batch_of_1024_rows | 0.593061 / 0.000490 (0.592572) |
| get_first_row | 0.005648 / 0.000200 (0.005448) |
| get_last_row | 0.000125 / 0.000054 (0.000071) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.027193 / 0.037411 (-0.010219) |
| shard | 0.111033 / 0.014526 (0.096507) |
| shuffle | 0.127055 / 0.176557 (-0.049502) |
| sort | 0.180804 / 0.737135 (-0.556331) |
| train_test_split | 0.125420 / 0.296338 (-0.170918) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.649776 / 0.215209 (0.434567) |
| read 50000 | 6.251325 / 2.077655 (4.173670) |
| read_batch 50000 10 | 2.738987 / 1.504120 (1.234867) |
| read_batch 50000 100 | 2.316074 / 1.541195 (0.774879) |
| read_batch 50000 1000 | 2.441730 / 1.468490 (0.973240) |
| read_formatted numpy 5000 | 0.694346 / 4.584777 (-3.890430) |
| read_formatted pandas 5000 | 5.496525 / 3.745712 (1.750813) |
| read_formatted tensorflow 5000 | 5.480468 / 5.269862 (0.210607) |
| read_formatted torch 5000 | 2.400663 / 4.565676 (-2.165014) |
| read_formatted_batch numpy 5000 10 | 0.083354 / 0.424275 (-0.340922) |
| read_formatted_batch numpy 5000 1000 | 0.014224 / 0.007607 (0.006617) |
| shuffled read 5000 | 0.756720 / 0.226044 (0.530675) |
| shuffled read 50000 | 7.834362 / 2.268929 (5.565433) |
| shuffled read_batch 50000 10 | 3.320097 / 55.444624 (-52.124528) |
| shuffled read_batch 50000 100 | 2.873795 / 6.876477 (-4.002682) |
| shuffled read_batch 50000 1000 | 2.857391 / 2.142072 (0.715318) |
| shuffled read_formatted numpy 5000 | 0.895277 / 4.805227 (-3.909951) |
| shuffled read_formatted_batch numpy 5000 10 | 0.192159 / 6.500664 (-6.308505) |
| shuffled read_formatted_batch numpy 5000 1000 | 0.076604 / 0.075469 (0.001135) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 1.958562 / 1.841788 (0.116774) |
| map fast-tokenizer batched | 16.493326 / 8.074308 (8.419018) |
| map identity | 39.677461 / 10.191392 (29.486069) |
| map identity batched | 1.212823 / 0.680424 (0.532400) |
| map no-op batched | 0.794790 / 0.534201 (0.260589) |
| map no-op batched numpy | 0.498801 / 0.579283 (-0.080482) |
| map no-op batched pandas | 0.619138 / 0.434364 (0.184774) |
| map no-op batched pytorch | 0.335631 / 0.540337 (-0.204707) |
| map no-op batched tensorflow | 0.361792 / 1.386936 (-1.025144) |
