8-bit enum type support? #1035

Open
d-chambers opened this issue Jul 21, 2023 · 10 comments

Comments

@d-chambers

Hello,

I am using pytables to read an HDF5 file created by another program. These files use 8-bit enums to indicate True (1) and False (0) values. Pytables doesn't like this, and I get hundreds of warnings like the following:

 DataTypeWarning: Unsupported type for attribute XXX in node '/'. Offending HDF5 class: 8

Looking at it in HDF5Viewer, these attributes look like this:

[image: screenshot of the attributes in HDF5Viewer]

Is there a way to make these warnings go away, or, better yet, register this simple enum type so pytables can read in the correct boolean values?

Thanks!
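
On the first half of the question (making the warnings go away), here is a minimal sketch that uses only the standard `warnings` module. So that it runs without PyTables or the sample file, `DataTypeWarning` and `fake_open_file` are local stand-ins, and `call_quietly` is a hypothetical helper name. In real use one would presumably pass the warning class PyTables emits and call `tables.open_file` instead. Note that this only hides the messages; the enum attributes are still not read.

```python
import warnings


class DataTypeWarning(Warning):
    """Local stand-in for the PyTables warning class (assumption)."""


def fake_open_file():
    """Hypothetical stand-in for tables.open_file on the problematic file."""
    warnings.warn(
        "Unsupported type for attribute XXX in node '/'. Offending HDF5 class: 8",
        DataTypeWarning,
    )
    return "file-handle"


def call_quietly(func, *args, **kwargs):
    """Call func with the 'Unsupported type for attribute' warnings ignored."""
    with warnings.catch_warnings():
        # `message` is a regex matched against the start of the warning text.
        warnings.filterwarnings(
            "ignore",
            message="Unsupported type for attribute",
            category=DataTypeWarning,
        )
        return func(*args, **kwargs)


handle = call_quietly(fake_open_file)
```

Because the filter is installed inside `warnings.catch_warnings()`, it is removed again when the call returns, so other warnings elsewhere in the program are unaffected.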

@avalentino
Member

Dear @d-chambers,
PyTables has an EnumAtom that in principle should be able to cope with the case you describe.
Do you have a minimal sample file to share, so that I can take a look at it?

@d-chambers
Author

Hi @avalentino,

Sure, here is a file with a script to reproduce the warnings. Thanks!

pytables_issue_1035.zip

@d-chambers
Author

Hey @avalentino, did you have a chance to take a look at this yet? Anything I can do to help?

Thanks!

@avalentino
Member

Dear @d-chambers,
Sorry, I haven't had the time to look at this.
I have been very busy lately, and this week will also be quite full.

If my understanding is correct it should be possible to address the issue using the EnumAtom.
Have you tried to play with it?

@d-chambers
Author

> If my understanding is correct it should be possible to address the issue using the EnumAtom.
> Have you tried to play with it?

I haven't yet, but I will look at it later today. I can certainly define the Enum so it matches the info in HDF5Viewer, but what isn't clear is whether I need to plug it in somewhere so that pytables knows about it when I read an existing file.

@d-chambers
Author

d-chambers commented Aug 3, 2023

Ok, I played around with this a bit. It was easy to define an EnumAtom which looks right based on the output from HDF5Viewer. However, I still need some guidance on how to get pytables to recognize the enum atom when it reads files, because these warnings are issued when calling tables.open_file with a problematic file.

When I filter the Python warnings to raise an error, I see this TypeError from utilsextension.pyx:

E   TypeError: the HDF5 class ``H5T_ENUM`` is not supported yet

tables/utilsextension.pyx:1321: TypeError

Any advice on where to go from here?

@avalentino
Member

OK, the problem seems to be a little more complex than expected.
The AttributeSet._g_getattr and the corresponding AttributeSet._g_setattr in tables/utilsextension.pyx do not handle the HDF5 type H5T_ENUM.

A quick patch that returns the integer value representing the enum would probably be relatively easy to implement, but a more complete solution would require a more careful design.

Any idea is very welcome.

@d-chambers
Author

It looks like h5py just returns the enum value (e.g., False/True). Is there a reason that would be undesirable?
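
To make the read-side idea concrete, here is a minimal pure-Python sketch: one 8-bit enum value is decoded either to a Python bool (when the enum has exactly the members FALSE=0 and TRUE=1) or to its plain integer value. The member names `FALSE`/`TRUE`, the signed 8-bit base type, and the helper `decode_enum8` are assumptions for illustration; this is not how h5py or PyTables implement it.

```python
import struct

# Assumed enum declaration: member name -> 8-bit integer value.
BOOL_ENUM = {"FALSE": 0, "TRUE": 1}


def decode_enum8(raw, members):
    """Decode one signed 8-bit enum value stored in `raw` (a bytes object)."""
    (value,) = struct.unpack("b", raw)
    if value not in members.values():
        raise ValueError(f"{value} is not a member of the enum")
    # Special case: a two-member FALSE/TRUE enum is treated as a boolean.
    if members == {"FALSE": 0, "TRUE": 1}:
        return bool(value)
    # Otherwise fall back to the plain integer value.
    return value


print(decode_enum8(b"\x01", BOOL_ENUM))  # True
print(decode_enum8(b"\x00", BOOL_ENUM))  # False
```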

@avalentino
Member

It would be OK for the reading part, but it is not sufficient for writing.
If we just pass an (int)enum to the _setattr method we will not be able to figure out the datatype that the user expects to find in the HDF5 file.
Moreover, we need to find a way to re-use the enum declaration once it is defined.

That said, just having the reading part for enums would be a great starting point.
Would you like to provide a PR for it?
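
One way to picture the writing problem: a bare `1` does not say whether the attribute should be stored as an integer or as a member of a particular enum, so the value has to travel together with its declaration. The sketch below uses the standard-library `enum` module, whose members carry their declaring class with them. `Bool8` and `describe_for_write` are hypothetical names, not PyTables API, and this is only one possible design.

```python
import enum


class Bool8(enum.IntEnum):
    """Reusable enum declaration (assumed member names)."""
    FALSE = 0
    TRUE = 1


def describe_for_write(value):
    """Return (HDF5 class name, stored integer, enum members or None)."""
    if isinstance(value, enum.IntEnum):
        # The member knows its enum class, so the declaration can be recovered.
        members = {m.name: int(m) for m in type(value)}
        return ("H5T_ENUM", int(value), members)
    if isinstance(value, int):
        # A plain int carries no enum declaration, so it stays an integer.
        return ("H5T_INTEGER", int(value), None)
    raise TypeError(f"unsupported value: {value!r}")


print(describe_for_write(Bool8.TRUE))
print(describe_for_write(1))
```

Since `Bool8` is an ordinary class, the same declaration can be imported and reused wherever such attributes are written, which touches on the re-use concern raised above.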

@d-chambers
Author

> Would you like to provide a PR for it?

Sure. I am on vacation the rest of this week but I will pick it up next week.

@avalentino avalentino modified the milestones: 3.9.0, 3.9.1, 3.9.2 Oct 6, 2023
@ivilata ivilata modified the milestones: 3.9.2, 3.9.3 Nov 27, 2023