Add examples of different ways of creating string datasets #2424
string_data = ["varying", "sizes", "of", "strings"]

# Variable length strings (implicit)
f['vlen_strings1'] = string_data
Is this `f` supposed to be the database? What is supposed to happen when you do it implicitly?
`f` is for file. This is a convention across the h5py docs, e.g. look further down this page, or at the dataset page. We don't really use the word database at all; HDF5 also talks about files rather than databases.
'Implicit' here just means you're not telling h5py a dtype, so it's guessing based on the object you give it.
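To make the "guessing" concrete, here is a minimal sketch, not part of the PR. The dataset names and the in-memory file are assumptions chosen for illustration. It stores two objects without a dtype and inspects what h5py chose using `h5py.check_string_dtype`:

```python
import h5py
import numpy as np

# In-memory file, nothing is written to disk; the file name is arbitrary.
with h5py.File("implicit_demo.h5", "w", driver="core", backing_store=False) as f:
    # No dtype given: a Python str is stored as a variable-length string.
    f["greeting"] = "hello"
    str_info = h5py.check_string_dtype(f["greeting"].dtype)

    # No dtype given: a NumPy bytes array ('S7') is stored as fixed-length strings.
    f["fixed"] = np.array([b"varying", b"sizes"], dtype="S7")
    fixed_info = h5py.check_string_dtype(f["fixed"].dtype)

# Each result has .encoding and .length; length is None for variable-length strings.
print(str_info)
print(fixed_info)
```

In both cases the dtype comes from the object itself, which is all that "implicit" means here.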
f['vlen_strings1'] = string_data

# Variable length strings (explicit)
ds = f.create_dataset('vlen_strings2', shape=4, dtype=h5py.string_dtype())
It'd be nice if there were some examples with the `data=` keyword argument; those are the ones that got me confused.
Passing `data=` should be equivalent to the implicit case (`f[name] =`) if you don't pass `dtype=`, or to the explicit case if you do. I'm showing the explicit dtype case as two lines to illustrate that it lets you create the dataset before you have all the data.
I'd rather not make this example longer by showing more possible ways to do the same thing. The explanation at creating datasets could probably be improved as well.
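For readers of this thread (not proposed for the docs), a rough sketch of the equivalence described above. The dataset names `one_step` and `two_step` and the in-memory file are assumptions made for illustration:

```python
import h5py

string_data = ["varying", "sizes", "of", "strings"]

# In-memory file, nothing is written to disk; the file name is arbitrary.
with h5py.File("data_kwarg_demo.h5", "w", driver="core", backing_store=False) as f:
    # One step: explicit dtype together with data=
    one_step = f.create_dataset(
        "one_step", data=string_data, dtype=h5py.string_dtype()
    )

    # Two steps: create the dataset first, fill it in once the data is available
    two_step = f.create_dataset("two_step", shape=4, dtype=h5py.string_dtype())
    two_step[:] = string_data

    # Compare the string type information and the stored values
    one_info = h5py.check_string_dtype(one_step.dtype)
    two_info = h5py.check_string_dtype(two_step.dtype)
    one_values = list(one_step.asstr()[:])
    two_values = list(two_step.asstr()[:])

print(one_info == two_info, one_values == two_values)
```

Both routes should end up with the same variable-length string type and the same contents; the two-step form only matters when the data arrives after the dataset is created.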
You can use :func:`.string_dtype` to explicitly specify any HDF5 string datatype.

::
Suggested change:
- You can use :func:`.string_dtype` to explicitly specify any HDF5 string datatype.
- ::
+ You can use :func:`.string_dtype` to explicitly specify any HDF5 string datatype,
+ as shown in the examples below::
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #2424   +/-   ##
=======================================
  Coverage   89.58%   89.58%
=======================================
  Files          17       17
  Lines        2391     2391
=======================================
  Hits         2142     2142
  Misses        249      249
Building on #2423, trying to illustrate the different possible ways to create string datasets.