Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/dssp #4304

Open
wants to merge 78 commits into
base: develop
Choose a base branch
from
Open

Feature/dssp #4304

wants to merge 78 commits into from

Conversation

marinegor
Copy link
Contributor

@marinegor marinegor commented Sep 29, 2023

Fixes #1612

Changes made in this Pull Request:

  • introduces MDAnalysis.analysis.dssp.DSSP class for secondary structure analysis, using code implemented in pydssp package available for secondary structure annotation
  • adds tests from pydssp package

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4304.org.readthedocs.build/en/4304/

@pep8speaks
Copy link

pep8speaks commented Sep 29, 2023

Hello @marinegor! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 2:80: E501 line too long (90 > 79 characters)
Line 14:80: E501 line too long (87 > 79 characters)
Line 16:80: E501 line too long (115 > 79 characters)
Line 27:80: E501 line too long (83 > 79 characters)
Line 48:80: E501 line too long (80 > 79 characters)
Line 64:80: E501 line too long (80 > 79 characters)
Line 73:80: E501 line too long (84 > 79 characters)
Line 80:80: E501 line too long (83 > 79 characters)
Line 82:80: E501 line too long (84 > 79 characters)
Line 84:80: E501 line too long (80 > 79 characters)
Line 190:80: E501 line too long (80 > 79 characters)
Line 217:80: E501 line too long (80 > 79 characters)
Line 221:80: E501 line too long (81 > 79 characters)
Line 223:80: E501 line too long (81 > 79 characters)
Line 257:80: E501 line too long (88 > 79 characters)
Line 282:80: E501 line too long (86 > 79 characters)
Line 288:80: E501 line too long (82 > 79 characters)
Line 301:80: E501 line too long (89 > 79 characters)
Line 309:80: E501 line too long (91 > 79 characters)
Line 338:80: E501 line too long (81 > 79 characters)

Line 68:80: E501 line too long (85 > 79 characters)
Line 157:80: E501 line too long (84 > 79 characters)
Line 179:80: E501 line too long (83 > 79 characters)
Line 206:80: E501 line too long (80 > 79 characters)
Line 207:80: E501 line too long (81 > 79 characters)

Line 10:80: E501 line too long (81 > 79 characters)
Line 11:80: E501 line too long (85 > 79 characters)
Line 12:80: E501 line too long (82 > 79 characters)
Line 30:80: E501 line too long (80 > 79 characters)
Line 42:80: E501 line too long (80 > 79 characters)
Line 52:80: E501 line too long (80 > 79 characters)
Line 55:80: E501 line too long (82 > 79 characters)

Comment last updated at 2024-05-29 07:17:22 UTC

@github-actions
Copy link

github-actions bot commented Sep 29, 2023

Linter Bot Results:

Hi @marinegor! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location Outcome
main package ⚠️ Possible failure
testsuite ⚠️ Possible failure

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/9281867626/job/25538692880


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

@marinegor
Copy link
Contributor Author

marinegor commented Sep 29, 2023

There's a problem I don't know the workaround for yet: prolines.

In the original pydssp following happens:

  • atoms get parsed from pdb regardless of residue type
  • hydrogens get guilt automatically
  • based on that, secondary structure is assigned

However, in my implementation all prolines (and N-terminal residue) are assigned '-' (=unstructured loop).

This is likely less correct than original implementation, but I'm not sure how to fix that.
My suggestion would be to:

  • if guess_hydrogens=False, use positions of existing hydrogens and fill in fake proline hydrogens with pydssp
  • if guess_hydrogens=True, default to pydssp behaviour with fully automatic hydrogens

@marinegor
Copy link
Contributor Author

The prolines problem is now dealt with, exactly as I described above -- if guess_hydrogens=False, it would take hydrogens from atoms that have them, and otherwise (=prolines & N-term) guess their positions. Otherwise, would guess all the positions.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks already very promising.

For an initial quick review please see inline comments.

Additional requests

  • Fix the tests: AttributeError: module 'MDAnalysisTests.datafiles' has no attribute 'DSSP'
  • We also need a test with a trajectory.
  • Compress the PDF files with bz2 or gz.
  • Update CHANGELOG.
  • Add a .. versionadded:: 2.7.0 to the docs.
  • Fix PEP8 complaints.

Initially distinguishing H,E,- is good but eventually it would be good to distinguish other secondary structures, too, in particular 310 and π helices.

package/MDAnalysis/analysis/dssp.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/dssp.py Outdated Show resolved Hide resolved
Comment on lines 6 to 9
CONST_Q1Q2 = 0.084
CONST_F = 332
DEFAULT_CUTOFF = -0.5
DEFAULT_MARGIN = 1.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these constants from from the original DSSP algorithm?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes -- see "Hydrogen-Bonded Structure" part of the paper.

package/MDAnalysis/analysis/dssp.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/dssp.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/dssp.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/dssp.py Outdated Show resolved Hide resolved
Comment on lines 152 to 153
h3 = h3 * ~np.roll(helix4, -1, 1) * ~helix4 # helix4 is higher prioritized
h5 = h5 * ~np.roll(helix4, -1, 1) * ~helix4 # helix4 is higher prioritized
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get 3_10 and π helix, too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not certain, but seems like yes (used this paper's Fig 1):

from MDAnalysis.analysis.dssp import DSSP
import MDAnalysis as mda

u = mda.Universe('./1FUR.pdb')
r = DSSP(u).run()

r.results.resids[151:158]
# array([155, 156, 157, 158, 159, 160, 161])

''.join(r.results.dssp[0])[151:158]
# 'HHHH-HH'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it can detect 3-10 and pi helixes indeed -- I generated idealized structures with mmtbx from here and ran DSSP on them:

dssp/o_beta_seq.pdb           --------------------
dssp/o_helix310_seq.pdb    -HHHHHHHHHHHHHHHHHH-
dssp/o_helix_seq.pdb          -HHHHHHHHHHHHHHHHHH-
o_helix_pi_seq.pdb              -HHHHHHHHHHHHHHHHHH-

package/MDAnalysis/analysis/dssp.py Outdated Show resolved Hide resolved
testsuite/MDAnalysisTests/analysis/test_dssp.py Outdated Show resolved Hide resolved
@orbeckst orbeckst self-assigned this Oct 10, 2023
@orbeckst orbeckst linked an issue Oct 10, 2023 that may be closed by this pull request
@orbeckst orbeckst added the hackathon part of a MDAnalysis coding event label Oct 10, 2023
@marinegor
Copy link
Contributor Author

marinegor commented Apr 8, 2024

Hi @yuxuanzhuang and @orbeckst , thanks for your review! Main changes:

  • now works with AtomGroup (and there's a test for that)
  • there are no more dssp-related errors when doing sphinx doctest
  • moved # pragma: no cover to the codepaths that actually are not tested.

Hope that works!

UPD: also again, CI/CD failures are codecov-related, as far as I can see.

Copy link
Contributor

@yuxuanzhuang yuxuanzhuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've corrected a missing reference bracket; everything else seems fine to me! It's a great addition to MDAnalysis!

'--EEEE----------HHHH'


Functions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Oliver means moving the section of function (i.e. this block) in the documentation after Analysis classes

@marinegor
Copy link
Contributor Author

@yuxuanzhuang I don't get which section you mean. If this one: https://mdanalysis--4304.org.readthedocs.build/en/4304/documentation_pages/analysis_modules.html
Then DSSP is already there, under "protein structure"

@yuxuanzhuang
Copy link
Contributor

@yuxuanzhuang I don't get which section you mean. If this one: https://mdanalysis--4304.org.readthedocs.build/en/4304/documentation_pages/analysis_modules.html Then DSSP is already there, under "protein structure"

modified :)

@marinegor
Copy link
Contributor Author

ah yeah I see now, my bad :) thanks a lot!

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work here @marinegor.

Please consider anything covered by [enh] as enhancements - i.e. nice to haves.

There are a few small blockers here that need addressing.

A note: this adds roughly ~ 1 MB of extra compressed data to the testsuite. I don't consider this a blocker, but something to keep in mind (currently we are at 56 MB compressed).

LICENSE Show resolved Hide resolved
@@ -39,6 +39,7 @@ Fixes
* Fix groups.py doctests using sphinx directives (Issue #3925, PR #4374)

Enhancements
* Add `analysis.DSSP` module for protein secondary structure assignment, based on [pydssp](https://github.com/ShintaroMinami/PyDSSP)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not going to block on this, but if you can wrap around at 79 chars here that would be grand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

mean_secondary_structure = translate(
long_run.results.dssp_ndarray.mean(axis=0)
)
print(''.join(mean_secondary_structure)[:20])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm re-opening this, I had the exact same question as @orbeckst reading through this. I'm not sure if the 20 residues clarification got removed later for other reasons, so I'm not going to block over this.


Running this code produces ::

'-EEEEEE------HHHHHHH'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are better ways of doing this, especially if we want doctest to run well over this.

Something like this:

In the simplest case, we can simply calculate the C-alpha RMSD between
two structures, using :func:`rmsd`::
>>> ref = mda.Universe(PDB_small)
>>> mobile = mda.Universe(PSF, DCD)
>>> rmsd(mobile.select_atoms('name CA').positions, ref.select_atoms('name CA').positions)
28.20178579474479

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my understanding was the opposite, and I recall that code-block:: python actually gives some errors in doctest. I'll rewrite if necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of dislike the >>> form for longer code examples, at least if I as a human have to read it.

@ljwoods2 can I pick your brain? I think you recently wrote doctests that did not use >>>. How did you do that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike >>> too, perhaps this was the reason for me thinking that code-block syntax was correct.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this example, we will find residue groups that maintain their secondary structure
along the simulation, and have some meaningful ('E' or 'H') secondary structure
during more than set `threshold` share of frames. We will call these residues
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
during more than set `threshold` share of frames. We will call these residues
during more than set `threshold` fraction of the total frames. We will call these residues

Feel free to reject, it's just a nitpick - reading this I wasn't sure if the code below was going to expect a set number of frames or a fraction of the total number of frames.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, thanks.

package/MDAnalysis/analysis/dssp/dssp.py Show resolved Hide resolved
@@ -72,6 +72,7 @@ include=[
"MDAnalysisTests.data.gromacs",
"MDAnalysisTests.data.lammps",
"MDAnalysisTests.data.mol2",
"MDAnalysisTests.data.dssp",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specific folder location is missing an __init__.py file - could you add one please?

Copy link
Contributor Author

@marinegor marinegor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments, will fix the rest (selections, hydrogens, and documentation) later.

LICENSE Show resolved Hide resolved
@@ -39,6 +39,7 @@ Fixes
* Fix groups.py doctests using sphinx directives (Issue #3925, PR #4374)

Enhancements
* Add `analysis.DSSP` module for protein secondary structure assignment, based on [pydssp](https://github.com/ShintaroMinami/PyDSSP)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

mean_secondary_structure = translate(
long_run.results.dssp_ndarray.mean(axis=0)
)
print(''.join(mean_secondary_structure)[:20])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a little comment about that in the code snippet.

My reasoning behind [:20] was that the full output is this:

--EEEE----------HHHHHHH----EE----HHHHH------HHHHHHHHHH------HHHHHHHHHHH---------EEEE-----HHHHHHHHH------EEEEEE--HHHHHH----EE--------EE---E----------------------HHHHHHHHHHHHHHHHHHHHHHHHHHHH----EEEEE------HHHHHHHHH--

which is kinda lengthy for the documentation🤷‍♂️


Running this code produces ::

'-EEEEEE------HHHHHHH'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my understanding was the opposite, and I recall that code-block:: python actually gives some errors in doctest. I'll rewrite if necessary.

package/MDAnalysis/analysis/dssp/dssp.py Show resolved Hide resolved
self._guess_hydrogens = guess_hydrogens

ag: AtomGroup = (
atoms.select_atoms("protein") if isinstance(atoms, Universe) else atoms
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens here if you pick up a amino acid residue with more protons than is "normally" expected? E.g. ASH or GLUH

I have requested the files of this type on discord, but if you have any in mind, please let me know.

Given you end up doing a lot of protein residue selection below, would it make sense to just do it here once?

Perhaps, yes. I'll try refactoring it.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing my last comments. I found minor things and added a whole bunch of code changes — please add those together with the ones from @IAlibay 's review. Thanks!

package/MDAnalysis/analysis/dssp/dssp.py Outdated Show resolved Hide resolved
mean_secondary_structure = translate(
long_run.results.dssp_ndarray.mean(axis=0)
)
print(''.join(mean_secondary_structure)[:20])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just show everything to make clearer what's happening, especially without explanation? Please remove the [:20] and add the full output. It's not too long.


Running this code produces ::

'-EEEEEE------HHHHHHH'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of dislike the >>> form for longer code examples, at least if I as a human have to read it.

@ljwoods2 can I pick your brain? I think you recently wrote doctests that did not use >>>. How did you do that?



class DSSP(AnalysisBase):
"""Assign secondary structure using DSSP algorithm.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include please

package/MDAnalysis/analysis/dssp/dssp.py Show resolved Hide resolved
package/MDAnalysis/analysis/dssp/pydssp_numpy.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/dssp/pydssp_numpy.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/dssp/pydssp_numpy.py Outdated Show resolved Hide resolved
testsuite/MDAnalysisTests/analysis/test_dssp.py Outdated Show resolved Hide resolved
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are minor doc issues to fix, please see comments (incorrect reST syntax, missing print(), ...)

Please also address other reviews.

I am going to approve because in principle I am satisfied. Good work!

package/MDAnalysis/analysis/dssp/dssp.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/dssp/dssp.py Show resolved Hide resolved
package/MDAnalysis/analysis/dssp/dssp.py Show resolved Hide resolved
package/MDAnalysis/analysis/dssp/dssp.py Outdated Show resolved Hide resolved
test_dssp.py Outdated Show resolved Hide resolved
manual test a top level, should not be in source
@orbeckst
Copy link
Member

orbeckst commented May 3, 2024

@IAlibay would you mind taking a quick final glance at the DSSP PR?

I think your comments were addressed but would prefer you confirm instead of me dismissing a totally valid review.

Would be great if we could finally get that one in.

@marinegor
Copy link
Contributor Author

@IAlibay any comments on this one?

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the long review delay.

I just have the one docstring confusion, once addressed then it's good to go.

Tagging @orbeckst here - please do dismiss my review if this gets addressed and I don't get a chance to re-review.

for calpha, hydrogen in zip(
self._heavy_atoms["CA"][1:], self._hydrogens[1:]
):
if not hydrogen and calpha.resname != "PRO":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the warning, however I'm still a bit confused with this https://github.com/MDAnalysis/mdanalysis/pull/4304/files#diff-cad8095a75b51cba3d6ef256c3c05073b457f309406ae4a94f45ac53257f3e81R202

It claims we're checking for "H" but we only expect one "H" according to the warning? (sorry I'm probably misreading something).

@marinegor
Copy link
Contributor Author

Hi @IAlibay -- couldn't follow your link, maybe you explain in more detail? I updated class's documentation slightly to match the description of the run method, but not sure if it solves your confusion

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marinegor and @IAlibay I think I understand the point of your discussion: As far as I can tell, I agree with @IAlibay that the current ValueError check does not exactly check that there's exactly one HN. If we could change the code then that would be safer, please see comments.

I think I also found a potential issue with how _heavy_atoms and _hydrogens are constructed. Please check.

Comment on lines +187 to +195
.. Warning::
For the DSSP to work properly, your atoms (either as ``Universe``
or ``AtomGroup``) must have only one hydrogen atom "H" per residue,
that corresponds to N-bound hydrogen in a peptide bond (or zero hydrogen
atoms, if the residue is proline). It is selected using `hydrogen_atom`
parameter. It is designed to hande most of the modern syntaxes,
but if you encounter an error or unusual results
during your run, please report an issue on MDAnalysi
`github <https://github.com/MDAnalysis/mdanalysis/issues>`_.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the following is clearer?

Suggested change
.. Warning::
For the DSSP to work properly, your atoms (either as ``Universe``
or ``AtomGroup``) must have only one hydrogen atom "H" per residue,
that corresponds to N-bound hydrogen in a peptide bond (or zero hydrogen
atoms, if the residue is proline). It is selected using `hydrogen_atom`
parameter. It is designed to hande most of the modern syntaxes,
but if you encounter an error or unusual results
during your run, please report an issue on MDAnalysi
`github <https://github.com/MDAnalysis/mdanalysis/issues>`_.
.. Warning::
For DSSP to work properly, your atoms must represent a protein. The
hydrogen atom bound to the backbone nitrogen atom is matched by name
as given by the keyword argument `hydrogen_atom`. There may only be
a single backbone nitrogen hydrogen atom per residue; the one exception
is proline, for which there should not exist any such hydrogens.
The default value of `hydrogen_atom` should to handle the common naming
conventions in the PDB and in force fields but if you encounter an error
or unusual results during your run, try to figure out how to select the
correct hydrogen atoms and report an issue in the MDAnalysis
`issue tracker <https://github.com/MDAnalysis/mdanalysis/issues>`_.

.. Note::
To work with different hydrogen-naming conventions by default, the default
selection is broad but if hydrogens are incorrectly selected (e.g., a
:exc:`ValueError` is raised) you must customize ``hydrogen_name`` for your
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reference kwargs by single backticks

Suggested change
:exc:`ValueError` is raised) you must customize ``hydrogen_name`` for your
:exc:`ValueError` is raised) you must customize `hydrogen_name` for your

for calpha, hydrogen in zip(
self._heavy_atoms["CA"][1:], self._hydrogens[1:]
):
if not hydrogen and calpha.resname != "PRO":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The warning says that we expect exactly one H for non-PRO. Can you write the if with the following logic

              if calpha.resname == "PRO":
                  # special case PRO
                  if hydrogen:
                       raise ValueError(f"PRO {calpha.resid} has a hydrogen {hydrogen}")
                  # else:
                  #     pass  # expected: PRO has no hydrogen
              elif not hydrogen or len(hydrogen) > 1:
                  # not a PRO and not exactly one HN: that's bad
                  raise ValueError(f"{calpha.resname} {calpha.resid} has not exactly one HN, instead it has {hydrogen}")

At least that's my understanding what we should be seeing, @marinegor ?

(There's better code to be written than what I wrote but for following the logic I inserted a pass etc...)

Comment on lines +278 to +289
self._heavy_atoms: dict[str, "AtomGroup"] = {
t: ag.atoms[
np.isin(
ag.names, t.split()
) # need split() since `np.isin` takes an iterable as second argument
# and "N".split() -> ["N"]
]
for t in heavyatom_names
}
self._hydrogens: list["AtomGroup"] = [
res.atoms.select_atoms(f"name {hydrogen_name}") for res in ag.residues
]
Copy link
Member

@orbeckst orbeckst May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the following code I see that there's the assumption that the order of atoms in _heavy_atoms["CA"] is the same as in _hydrogens.

However, that may not be true: _hydrogens is created from a select_atoms() which always sorts the atoms by index in increasing order whereas ag's in _heavy_atoms are created by look up and slicing. If the original input ag is NOT ORDERED then _hydrogens will be ORDERED whereas _heavy_atoms remain in the CUSTOM ORDER of the original ag. At least that's what it looks like to me.

It may be safer to just sort the original ag first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component-Analysis enhancement hackathon part of a MDAnalysis coding event
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for secondary structure identification
6 participants