Commit

Use python 3.7 in ubuntu-latest
mariosasko committed Jul 22, 2022
1 parent 983b04e commit eac1aaa
Showing 2 changed files with 3 additions and 8 deletions.
10 changes: 2 additions & 8 deletions .github/workflows/ci.yml
@@ -21,7 +21,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v4
         with:
-          python-version: "3.6"
+          python-version: "3.7"
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
@@ -49,21 +49,15 @@ jobs:
       - uses: actions/checkout@v3
         with:
           fetch-depth: 0
-      - name: Set up Python 3.6
-        if: ${{ matrix.os == 'ubuntu-latest' }}
-        uses: actions/setup-python@v4
-        with:
-          python-version: 3.6
       - name: Set up Python 3.7
-        if: ${{ matrix.os == 'windows-latest' }}
         uses: actions/setup-python@v4
         with:
           python-version: 3.7
       - name: Upgrade pip
         run: python -m pip install --upgrade pip
       - name: Pin setuptools-scm
         if: ${{ matrix.os == 'ubuntu-latest' }}
-        run: echo "installing pinned version of setuptools-scm to fix seqeval installation on 3.6" && pip install "setuptools-scm==6.4.2"
+        run: echo "installing pinned version of setuptools-scm to fix seqeval installation on 3.7" && pip install "setuptools-scm==6.4.2"
       - name: Install dependencies
         run: |
           pip install .[tests]
1 change: 1 addition & 0 deletions src/datasets/packaged_modules/text/dataset_infos.json
@@ -0,0 +1 @@
+{"bigscience": {"description": "", "citation": "", "homepage": "", "license": "", "features": {"text": {"dtype": "string", "id": null, "_type": "Value"}}, "post_processed": null, "supervised_keys": null, "task_templates": null, "builder_name": "text", "config_name": "bigscience", "version": {"version_str": "0.0.0", "description": null, "major": 0, "minor": 0, "patch": 0}, "splits": {"train": {"name": "train", "num_bytes": 938, "num_examples": 22, "dataset_name": "text"}}, "download_checksums": {"C:\\Users\\Mario\\Desktop\\bigscience\\biscience.txt": {"num_bytes": 892, "checksum": "1e1f85c9e2aefb6990dc6ec4a8805af1e5451ebecb7e9f50face10c83eed742e"}}, "download_size": 892, "post_processing_size": null, "dataset_size": 938, "size_in_bytes": 1830}}

1 comment on commit eac1aaa

@github-actions

PyArrow==6.0.0

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.007398 / 0.011353 (-0.003955) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.003577 / 0.011008 (-0.007431) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.028471 / 0.038508 (-0.010037) |
| read_batch_unformated after write_array2d | 0.029276 / 0.023109 (0.006167) |
| read_batch_unformated after write_flattened_sequence | 0.298349 / 0.275898 (0.022451) |
| read_batch_unformated after write_nested_sequence | 0.356854 / 0.323480 (0.033374) |
| read_col_formatted_as_numpy after write_array2d | 0.005384 / 0.007986 (-0.002601) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.004106 / 0.004328 (-0.000223) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.006743 / 0.004250 (0.002493) |
| read_col_unformated after write_array2d | 0.039688 / 0.037052 (0.002635) |
| read_col_unformated after write_flattened_sequence | 0.315680 / 0.258489 (0.057191) |
| read_col_unformated after write_nested_sequence | 0.357403 / 0.293841 (0.063562) |
| read_formatted_as_numpy after write_array2d | 0.029256 / 0.128546 (-0.099291) |
| read_formatted_as_numpy after write_flattened_sequence | 0.009222 / 0.075646 (-0.066424) |
| read_formatted_as_numpy after write_nested_sequence | 0.253887 / 0.419271 (-0.165384) |
| read_unformated after write_array2d | 0.045018 / 0.043533 (0.001485) |
| read_unformated after write_flattened_sequence | 0.301813 / 0.255139 (0.046674) |
| read_unformated after write_nested_sequence | 0.331494 / 0.283200 (0.048294) |
| write_array2d | 0.085504 / 0.141683 (-0.056179) |
| write_flattened_sequence | 1.493479 / 1.452155 (0.041325) |
| write_nested_sequence | 1.526867 / 1.492716 (0.034151) |

Benchmark: benchmark_getitem_100B.json

| metric | new / old (diff) |
|---|---|
| get_batch_of_1024_random_rows | 0.199465 / 0.018006 (0.181459) |
| get_batch_of_1024_rows | 0.427513 / 0.000490 (0.427023) |
| get_first_row | 0.011037 / 0.000200 (0.010837) |
| get_last_row | 0.000215 / 0.000054 (0.000161) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.020698 / 0.037411 (-0.016713) |
| shard | 0.091616 / 0.014526 (0.077090) |
| shuffle | 0.105453 / 0.176557 (-0.071104) |
| sort | 0.147412 / 0.737135 (-0.589723) |
| train_test_split | 0.107617 / 0.296338 (-0.188722) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.409456 / 0.215209 (0.194247) |
| read 50000 | 4.078044 / 2.077655 (2.000389) |
| read_batch 50000 10 | 1.840016 / 1.504120 (0.335896) |
| read_batch 50000 100 | 1.644390 / 1.541195 (0.103195) |
| read_batch 50000 1000 | 1.663008 / 1.468490 (0.194518) |
| read_formatted numpy 5000 | 0.444545 / 4.584777 (-4.140232) |
| read_formatted pandas 5000 | 3.341038 / 3.745712 (-0.404674) |
| read_formatted tensorflow 5000 | 2.858492 / 5.269862 (-2.411369) |
| read_formatted torch 5000 | 1.459815 / 4.565676 (-3.105862) |
| read_formatted_batch numpy 5000 10 | 0.053061 / 0.424275 (-0.371214) |
| read_formatted_batch numpy 5000 1000 | 0.010959 / 0.007607 (0.003352) |
| shuffled read 5000 | 0.520800 / 0.226044 (0.294755) |
| shuffled read 50000 | 5.228616 / 2.268929 (2.959687) |
| shuffled read_batch 50000 10 | 2.252439 / 55.444624 (-53.192185) |
| shuffled read_batch 50000 100 | 1.939445 / 6.876477 (-4.937032) |
| shuffled read_batch 50000 1000 | 2.004834 / 2.142072 (-0.137239) |
| shuffled read_formatted numpy 5000 | 0.559802 / 4.805227 (-4.245425) |
| shuffled read_formatted_batch numpy 5000 10 | 0.118220 / 6.500664 (-6.382444) |
| shuffled read_formatted_batch numpy 5000 1000 | 0.064008 / 0.075469 (-0.011461) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 1.507930 / 1.841788 (-0.333858) |
| map fast-tokenizer batched | 12.570237 / 8.074308 (4.495929) |
| map identity | 26.318169 / 10.191392 (16.126777) |
| map identity batched | 0.835742 / 0.680424 (0.155318) |
| map no-op batched | 0.546482 / 0.534201 (0.012281) |
| map no-op batched numpy | 0.345564 / 0.579283 (-0.233719) |
| map no-op batched pandas | 0.401868 / 0.434364 (-0.032496) |
| map no-op batched pytorch | 0.239905 / 0.540337 (-0.300432) |
| map no-op batched tensorflow | 0.243738 / 1.386936 (-1.143198) |

PyArrow==latest

Benchmark: benchmark_array_xd.json

| metric | new / old (diff) |
|---|---|
| read_batch_formatted_as_numpy after write_array2d | 0.005390 / 0.011353 (-0.005963) |
| read_batch_formatted_as_numpy after write_flattened_sequence | 0.003486 / 0.011008 (-0.007522) |
| read_batch_formatted_as_numpy after write_nested_sequence | 0.026956 / 0.038508 (-0.011552) |
| read_batch_unformated after write_array2d | 0.027341 / 0.023109 (0.004231) |
| read_batch_unformated after write_flattened_sequence | 0.352008 / 0.275898 (0.076110) |
| read_batch_unformated after write_nested_sequence | 0.374875 / 0.323480 (0.051395) |
| read_col_formatted_as_numpy after write_array2d | 0.003263 / 0.007986 (-0.004722) |
| read_col_formatted_as_numpy after write_flattened_sequence | 0.002918 / 0.004328 (-0.001411) |
| read_col_formatted_as_numpy after write_nested_sequence | 0.004567 / 0.004250 (0.000316) |
| read_col_unformated after write_array2d | 0.036812 / 0.037052 (-0.000240) |
| read_col_unformated after write_flattened_sequence | 0.344188 / 0.258489 (0.085699) |
| read_col_unformated after write_nested_sequence | 0.371225 / 0.293841 (0.077384) |
| read_formatted_as_numpy after write_array2d | 0.026465 / 0.128546 (-0.102081) |
| read_formatted_as_numpy after write_flattened_sequence | 0.009444 / 0.075646 (-0.066202) |
| read_formatted_as_numpy after write_nested_sequence | 0.250991 / 0.419271 (-0.168281) |
| read_unformated after write_array2d | 0.053550 / 0.043533 (0.010017) |
| read_unformated after write_flattened_sequence | 0.323901 / 0.255139 (0.068762) |
| read_unformated after write_nested_sequence | 0.354952 / 0.283200 (0.071753) |
| write_array2d | 0.092888 / 0.141683 (-0.048795) |
| write_flattened_sequence | 1.476007 / 1.452155 (0.023853) |
| write_nested_sequence | 1.511409 / 1.492716 (0.018693) |

Benchmark: benchmark_getitem_100B.json

| metric | new / old (diff) |
|---|---|
| get_batch_of_1024_random_rows | 0.174903 / 0.018006 (0.156897) |
| get_batch_of_1024_rows | 0.398260 / 0.000490 (0.397770) |
| get_first_row | 0.013084 / 0.000200 (0.012884) |
| get_last_row | 0.000263 / 0.000054 (0.000208) |

Benchmark: benchmark_indices_mapping.json

| metric | new / old (diff) |
|---|---|
| select | 0.022113 / 0.037411 (-0.015298) |
| shard | 0.093939 / 0.014526 (0.079413) |
| shuffle | 0.107060 / 0.176557 (-0.069496) |
| sort | 0.154477 / 0.737135 (-0.582658) |
| train_test_split | 0.110262 / 0.296338 (-0.186077) |

Benchmark: benchmark_iterating.json

| metric | new / old (diff) |
|---|---|
| read 5000 | 0.416567 / 0.215209 (0.201358) |
| read 50000 | 4.145955 / 2.077655 (2.068301) |
| read_batch 50000 10 | 1.881375 / 1.504120 (0.377255) |
| read_batch 50000 100 | 1.677843 / 1.541195 (0.136649) |
| read_batch 50000 1000 | 1.690085 / 1.468490 (0.221594) |
| read_formatted numpy 5000 | 0.449052 / 4.584777 (-4.135725) |
| read_formatted pandas 5000 | 3.353285 / 3.745712 (-0.392427) |
| read_formatted tensorflow 5000 | 1.818214 / 5.269862 (-3.451647) |
| read_formatted torch 5000 | 1.088354 / 4.565676 (-3.477322) |
| read_formatted_batch numpy 5000 10 | 0.052947 / 0.424275 (-0.371328) |
| read_formatted_batch numpy 5000 1000 | 0.010710 / 0.007607 (0.003103) |
| shuffled read 5000 | 0.516856 / 0.226044 (0.290811) |
| shuffled read 50000 | 5.163197 / 2.268929 (2.894269) |
| shuffled read_batch 50000 10 | 2.304279 / 55.444624 (-53.140345) |
| shuffled read_batch 50000 100 | 1.948118 / 6.876477 (-4.928358) |
| shuffled read_batch 50000 1000 | 2.008183 / 2.142072 (-0.133889) |
| shuffled read_formatted numpy 5000 | 0.558282 / 4.805227 (-4.246945) |
| shuffled read_formatted_batch numpy 5000 10 | 0.119774 / 6.500664 (-6.380890) |
| shuffled read_formatted_batch numpy 5000 1000 | 0.064154 / 0.075469 (-0.011315) |

Benchmark: benchmark_map_filter.json

| metric | new / old (diff) |
|---|---|
| filter | 1.530159 / 1.841788 (-0.311629) |
| map fast-tokenizer batched | 12.670106 / 8.074308 (4.595798) |
| map identity | 26.090362 / 10.191392 (15.898970) |
| map identity batched | 0.867692 / 0.680424 (0.187268) |
| map no-op batched | 0.596349 / 0.534201 (0.062148) |
| map no-op batched numpy | 0.349782 / 0.579283 (-0.229501) |
| map no-op batched pandas | 0.395510 / 0.434364 (-0.038854) |
| map no-op batched pytorch | 0.237751 / 0.540337 (-0.302586) |
| map no-op batched tensorflow | 0.243337 / 1.386936 (-1.143599) |
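
Each benchmark cell above follows the pattern `new / old (diff)`, and the reported numbers are consistent with diff being new minus old (up to rounding in the last digit). The sketch below is not part of the benchmark tooling; it is a small illustration, with hypothetical helper names `parse_cells` and `check_row`, of how one might parse such a row and sanity-check that relationship. The tolerance value is an assumption chosen to absorb six-decimal rounding.

```python
import re

# Matches one "new / old (diff)" cell, e.g. "0.199465 / 0.018006 (0.181459)".
CELL = re.compile(r"(-?\d+\.\d+) / (-?\d+\.\d+) \((-?\d+\.\d+)\)")


def parse_cells(row):
    """Return a list of (new, old, diff) float triples found in one row of text."""
    return [tuple(float(x) for x in m.groups()) for m in CELL.finditer(row)]


def check_row(row, tol=2e-6):
    """True if every reported diff equals new - old within the rounding tolerance."""
    return all(abs((new - old) - diff) <= tol for new, old, diff in parse_cells(row))


# Two cells copied from the benchmark_getitem_100B.json results (PyArrow==6.0.0).
row = "new / old (diff) 0.199465 / 0.018006 (0.181459) 0.427513 / 0.000490 (0.427023)"
cells = parse_cells(row)
print(cells)
print(check_row(row))
```

A positive diff means the new timing is larger than the old reference value. Whether that indicates a real regression depends on the run-to-run noise of the CI machine, which this sketch does not model.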
