Deprecate generating string dtypes with unspecified length #2085
The test failure is seemingly unrelated, in
Closing and reopening to reset CI then 😅
Thanks Thomas!
Thanks! I hope you've got safely back home by now. :-)
Yep, landed yesterday and back at work for most of today... Jetlag is a work in progress but trending in the right direction 🌍🌏
From discussion with @Zac-HD at Euroscipy. Generating arrays with string dtypes with `min_len=0` issues warnings like:

This is because the value generation from `from_dtype` interprets `'S0'` as meaning "strings of arbitrary length", whereas creating the empty array (like `np.zeros(dtype='S0', shape=10)`) sets the length of each string to 1.

An alternative way to work around this would be to generate the necessary values first, and then create an array with them. But arrays are currently created before their values, and it appears there's quite a bit of complexity in how arrays are filled, so it's not trivial to reorder that operation.
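
For illustration, the mismatch can be reproduced with NumPy alone. This is a minimal sketch, not code from this PR: the variable names are made up for the example, and it only shows NumPy's side of the behaviour (an unsized `'S0'` dtype becoming one byte per element on array creation), not how Hypothesis fills arrays or when it warns.

```python
import numpy as np

# The dtype object itself is "unsized": its itemsize is 0.
unsized = np.dtype("S0")

# Creating an array from it silently gives every element room for one byte.
arr = np.zeros(shape=10, dtype="S0")

# Any longer value written into the pre-allocated array is truncated to fit.
arr[0] = b"abc"

# By contrast, building the array from existing values lets NumPy pick a
# width that fits the longest one (the "values first" order described above).
values = [b"abc", b"de"]
from_values = np.array(values, dtype="S")

print(unsized.itemsize, arr.dtype, arr[0], from_values.dtype)
```

Here `arr` ends up with one-byte elements, so the three-byte value does not survive the assignment, while `from_values` is wide enough to hold every value unchanged.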