I would like to use netCDF4-python (as a backend to Xarray) to read some HDF5 files, and am unable to do so. Attempting to read the files actually crashes Python. I've traced the problem to a dimension scale with a null dataspace in the HDF5 files. I understand that not all HDF5 files are netCDF4 files, but I don't think they should crash Python.
And in this particular case, the HDF5 file seems perfectly interpretable. As an enhancement to netCDF4-python, you could interpret a dimension scale with null dataspace for what it is equivalent to in netCDF4, which is "a netCDF dimension but not a netCDF variable."
Here is a reproducible example of code that crashes Python. I'm not totally sure the problem isn't just a mismatch between the HDF5 libraries used, since both netCDF4-python and h5py package their own libraries. My installs built nothing from source.
% cat danger.py
from h5py import File
from netCDF4 import Dataset
with File('danger.h5', 'w') as group:
    dataset = group.create_dataset('y', shape=(3,), dtype=float)
    dimension = group.create_dataset('x', shape=None, dtype=int)  # will crash python when read below
    # dimension = group.create_dataset('x', shape=(3,), dtype=int)  # creates misleading dataset
    dimension.make_scale('x')
    dataset.dims[0].attach_scale(dimension)
with Dataset('danger.h5') as group:
    print(group)
% python danger.py
Assertion failed: (ndims), function get_scale_info, file hdf5open.c, line 1396.
zsh: abort python danger.py
Here is the complete h5dump of danger.h5 created by h5py. While it is not a netCDF4 file, I can't think of any reason netCDF4-python shouldn't interpret it correctly (as it does in the above code but using the commented line). It is a dimension that has no coordinates, which is valid in the netCDF4 model.
Thank you for considering! Here are my versions ...
% pip list
Package Version
---------- -------
cftime 1.6.2
h5py 3.7.0
netCDF4 1.6.2
numpy 1.23.5
pip 22.1.2
setuptools 62.3.3
wheel 0.37.1
[notice] A new release of pip available: 22.1.2 -> 22.3.1
[notice] To update, run: pip install --upgrade pip
% python --version
Python 3.10.8
% sw_vers
ProductName: macOS
ProductVersion: 12.6.1
BuildVersion: 21G217
itcarroll changed the title from "support for HDF5 dimension scales with an empty/null dataspace" to "support for HDF5 dimension scales with null dataspace" on Dec 12, 2022