
carray data written with blosc2 and fletcher32 filters will fail fletcher32 checksum on read #1162

Open
alasla opened this issue Apr 22, 2024 · 0 comments


alasla commented Apr 22, 2024

The current tables release can read blosc2:lz4 data written by previous versions of tables without problems. The previous versions were:
tables: 3.8
blosc2: 2.0.0

With the newest tables and blosc2 releases, when I write a carray with the blosc2:lz4 and fletcher32 filters and then try to read it back, I get a fletcher32 checksum error.

tables 3.9.2
blosc2 2.6.2
numpy 1.26.4

Reproduction code:

import tables
import numpy as np
import blosc2
print(tables.__version__)
print(np.__version__)
print(blosc2.__version__)

input_arr = np.array([[0, 1, 1], [1, 0, 1]], dtype='int8')
print(input_arr)

f = tables.open_file('/tmp/tables.h5', mode='w')
print(repr(f))

filters = tables.Filters(complevel=9, complib='blosc2:lz4', shuffle=False, fletcher32=True)
print(repr(filters))

arr = f.create_carray('/', 'data', tables.Int8Atom(), input_arr.shape, filters=filters, chunkshape=None, obj=input_arr)
print(repr(arr))

f.close()

f = tables.open_file('/tmp/tables.h5', mode='r')
print(repr(f))

n = f.get_node('/data')
print(repr(n))

n.read()

This results in an exception at n.read():

---------------------------------------------------------------------------
HDF5ExtError                              Traceback (most recent call last)
Cell In[31], line 28
     25 n = f.get_node('/data')
     26 print(repr(n))
---> 28 n.read()

File /usr/local/rbpython/lib/python3.10/site-packages/tables/array.py:866 in Array.read(self, start, stop, step, out)
    864     raise TypeError(msg)
    865 (start, stop, step) = self._process_range_read(start, stop, step)
--> 866 arr = self._read(start, stop, step, out)
    867 return internal_to_flavor(arr, self.flavor)

File /usr/local/rbpython/lib/python3.10/site-packages/tables/array.py:823 in Array._read(self, start, stop, step, out)
    820 # Protection against reading empty arrays
    821 if 0 not in shape:
    822     # Arrays that have non-zero dimensionality
--> 823     self._read_array(start, stop, step, arr)
    824 # data is always read in the system byteorder
    825 # if the out array's byteorder is different, do a byteswap
    826 if (out is not None and
    827         byteorders[arr.dtype.byteorder] != sys.byteorder):

File /usr/local/rbpython/lib/python3.10/site-packages/tables/hdf5extension.pyx:1593 in tables.hdf5extension.Array._read_array()

HDF5ExtError: HDF5 error back trace

  File "H5D.c", line 1061, in H5Dread
    can't synchronously read data
  File "H5D.c", line 1008, in H5D__read_api_common
    can't read data
  File "H5VLcallback.c", line 2092, in H5VL_dataset_read_direct
    dataset read failed
  File "H5VLcallback.c", line 2048, in H5VL__dataset_read
    dataset read failed
  File "H5VLnative_dataset.c", line 363, in H5VL__native_dataset_read
    can't read data
  File "H5Dio.c", line 383, in H5D__read
    can't read data
  File "H5Dchunk.c", line 2856, in H5D__chunk_read
    unable to read raw data chunk
  File "H5Dchunk.c", line 4468, in H5D__chunk_lock
    data pipeline read failed
  File "H5Z.c", line 1391, in H5Z_pipeline
    filter returned failure during read
  File "H5Zfletcher32.c", line 103, in H5Z__filter_fletcher32
    data error detected by Fletcher32 checksum

End of HDF5 error back trace

Problems reading the array data.

Everything works fine if I use zlib instead of blosc2. With the fletcher32 checksum turned off, I don't see any differences between the array read back and the input array, so nothing indicates that blosc2 corrupted the data.

Let me know if I can provide any other data.
