Deprecate generating string dtypes with unspecified length #2085

Merged


takluyver
Contributor

This came out of a discussion with @Zac-HD at EuroSciPy. Generating arrays with string dtypes with min_len=0 issues warnings like:

properties/test_pandas_roundtrip.py::test_roundtrip_pandas_dataframe
/home/takluyver/miniconda3/envs/hypothesise-xarray/lib/python3.7/site-packages/hypothesis/extra/numpy.py:147: HypothesisDeprecationWarning: Generated array element b'\xff\xc00\xff\xc8\xff\xb2\xff)\x00\xd6\xff\xff\xcb\xff' from binary().filter(lambda b: b[-1:] != b"\0").map(bytes_).map(convert_element) cannot be represented as dtype dtype('S') - instead it becomes b'\xff' (type <class 'numpy.bytes_'>). Consider using a more precise strategy, for example passing the width argument to floats(), as this will be an error in a future version.

This happens because value generation in from_dtype interprets 'S0' as meaning "strings of arbitrary length", whereas creating the empty array (for example np.zeros(dtype='S0', shape=10)) sets the length of each string to 1. Generated values longer than one byte are therefore truncated when they are stored in the array.
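
The mismatch can be reproduced with NumPy alone. The following is a minimal sketch, not code from this PR; the helper name show_unsized_string_dtype is made up for illustration, and the behaviour shown is what I would expect from NumPy given the description above.

```python
import numpy as np


def show_unsized_string_dtype():
    """Illustrative helper (hypothetical; not part of Hypothesis or NumPy)."""
    # 'S' is the same as 'S0': a bytestring dtype with no length specified.
    unsized = np.dtype("S")

    # Creating an array from the unsized dtype gives each element a
    # concrete width, rather than keeping "arbitrary length".
    arr = np.zeros(shape=10, dtype=unsized)

    # Storing a longer bytestring silently truncates it to that width.
    arr[0] = b"abc"
    return unsized.itemsize, arr.dtype.itemsize, arr[0]


declared, actual, stored = show_unsized_string_dtype()
print("declared itemsize:", declared)
print("array itemsize:", actual)
print("stored value:", stored)
```

The declared dtype reports an item size of 0, while the array's dtype has an item size of 1, so only the first byte of the assigned value survives. That gap is what the warning above is reporting.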

An alternative way to work around this would be to generate the necessary values first, and then create an array with them. But arrays are currently created before their values, and it appears there's quite a bit of complexity in how arrays are filled, so it's not trivial to reorder that operation.
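
For comparison, building the array from values that already exist lets NumPy pick a width that fits them. This is a rough sketch in plain NumPy, with hard-coded example values standing in for generated ones; it is not how Hypothesis fills arrays.

```python
import numpy as np

# Hypothetical stand-ins for values a strategy might have generated.
values = [b"abc", b"de", b""]

# With an unsized dtype, np.array sizes each element to fit the longest
# value instead of falling back to a width of 1.
arr = np.array(values, dtype="S")

print(arr.dtype, arr.tolist())
```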

@takluyver
Contributor Author

The test failure appears to be unrelated: it is in test_can_find_unique_lists_of_non_set_order, which is marked with a @flaky decorator, so presumably it just flaked a bit too much.

@Zac-HD
Member

Zac-HD commented Sep 6, 2019

Closing and reopening to reset CI then 😅

@Zac-HD Zac-HD closed this Sep 6, 2019
@Zac-HD Zac-HD reopened this Sep 6, 2019
Member

@Zac-HD Zac-HD left a comment

Thanks Thomas!

@Zac-HD Zac-HD merged commit a1bd4b2 into HypothesisWorks:master Sep 9, 2019
@takluyver takluyver deleted the deprecate-empty-string-dtypes branch September 10, 2019 07:07
@takluyver
Contributor Author

Thanks! I hope you've got safely back home by now. :-)

@Zac-HD
Member

Zac-HD commented Sep 10, 2019

Yep, landed yesterday and back at work for most of today... Jetlag a work in progress but trending in the right direction 🌍🌏
