Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

H5MD Can not read observables/<group>/<observable>/... #4598

Open
PythonFZ opened this issue May 17, 2024 · 7 comments
Open

H5MD Can not read observables/<group>/<observable>/... #4598

PythonFZ opened this issue May 17, 2024 · 7 comments
Labels
defect Format-H5MD hdf5-based H5MD trajectory format

Comments

@PythonFZ
Copy link

Expected behavior

The H5MD standard defines two main ways to define observables. Either directly as group below the root observables or as a group specific observable observables/group.

observables
 \-- <observable1>
 |    \-- step: Integer[variable]
 |    \-- time: Float[variable]
 |    \-- value: <type>[variable]
 \-- <group1>
 |    \-- <observable2>
 |         \-- step: Integer[variable]
 |         \-- time: Float[variable]
 |         \-- value: <type>[variable][D][D]

Both should be supported.

Actual behavior

In my experience only the observables after root are supported but observables/group are not. It leads to KeyError: "Unable to synchronously open object (object 'value' doesn't exist)". This seems to be briefly discussed in #4320 but I wanted to give more information here.

Code to reproduce the behavior

I've attached a file licl_100.h5.zip because the cobrotoxin.h5md test file only contains observables/lambda but no group specific observables.

import MDAnalysis as mda

u = mda.Universe.empty(100, n_residues=100, atom_resindex=np.arange(100), trajectory=True)
reader = H5MDReader("licl_100.h5", convert_units=False)
u.trajectory = reader
....

Current version of MDAnalysis

  • Which version are you using? (run python -c "import MDAnalysis as mda; print(mda.__version__)") 2.7.0
  • Which version of Python (python -V)? 3.10.9
  • Which operating system? Ubuntu 22
@orbeckst orbeckst added the Format-H5MD hdf5-based H5MD trajectory format label May 30, 2024
@orbeckst
Copy link
Member

@edisj @ljwoods2 is this issue a known limitation in our H5MD implementation?

@edisj
Copy link
Contributor

edisj commented May 30, 2024

yes, this is related to point 1) in #4561

basically, /particles can contain arbitrary "atom groups" like /particles/group_1/, /particles/group_2/, etc... that contain the position, velocity, and force data for specific subsets of particles group_1, group_2. If these exist, the /observables group can contain corresponding /observables/group_1/*/value, /observables/group_2/*/value, where * can be whatever you calculate (e.g. total energy, temperature, ...) that corresponds to a calculation over specific atom groups group_1, group_2, ...

At the moment, H5MDReader assumes there is only 1 group in /particles, and all of the observables in /observables correspond to that single particle group

@edisj
Copy link
Contributor

edisj commented May 30, 2024

the same thing is happening in #4320

@orbeckst
Copy link
Member

@PythonFZ how would you expect MDAnalysis to support groups, i.e., what exactly should be happening?

@orbeckst
Copy link
Member

orbeckst commented Jun 1, 2024

I could reproduce with

import MDAnalysis as mda
import numpy as np
u = mda.Universe.empty(100, n_residues=100, atom_resindex=np.arange(100), trajectory=True)
u.load_new("licl_100.h5", format="H5MD", convert_units=False)

and failed with

KeyError: "Unable to synchronously open object (object 'value' doesn't exist)"

If this is an otherwise legal H5MD file then we should be able to at least read the particles information, even if we ignore observables. So this may count as a bug in our H5MD reader and I tentatively assign the defect label.

However, I tried reading licl_100.h5 with the reference implementation (read.py from pyh5md)

 python read.py licl_100.h5
Traceback (most recent call last):
  File "~/tmp/h5md/read.py", line 17, in <module>
    with File(args.file, 'r') as f:
         ^^^^^^^^^^^^^^^^^^^^
  File "~/anaconda3/envs/mda311/lib/python3.11/site-packages/pyh5md/h5md_module.py", line 236, in __init__
    raise KeyError("h5md group not found in file")
KeyError: 'h5md group not found in file'

and note that @PythonFZ 's H5MD file cannot be loaded by pyh5md, presumably because it is missing the mandatory h5md root level group (see https://www.nongnu.org/h5md/h5md.html#h5md-root-level ). I don't have time to dissect the h5 file further so could you please comment and demonstrate that the test file does indeed fully conform to the spec?

@edisj
Copy link
Contributor

edisj commented Jun 1, 2024

and note that @PythonFZ 's H5MD file cannot be loaded by pyh5md, presumably because it is missing the mandatory h5md root level group (see https://www.nongnu.org/h5md/h5md.html#h5md-root-level ).

that's interesting... should our implementation also fail immediately if the root h5md group is missing?

@orbeckst
Copy link
Member

orbeckst commented Jun 1, 2024

yes — it's mandatory

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect Format-H5MD hdf5-based H5MD trajectory format
Projects
None yet
Development

No branches or pull requests

3 participants