Added LipidSelection class #4302

Draft
wants to merge 2 commits into
base: develop

@sriraksha-srinivasan sriraksha-srinivasan commented Sep 29, 2023

Fixes #2082

Changes made in this Pull Request:

  • residue names from CHARMM top_all36_lipid.rtf

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4302.org.readthedocs.build/en/4302/

* residue names from CHARMM top_all36_lipid.rtf
@pep8speaks

pep8speaks commented Sep 29, 2023

Hello @sriraksha-srinivasan! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 1063:2: E101 indentation contains mixed spaces and tabs
Line 1063:2: W191 indentation contains tabs
Line 1063:74: W291 trailing whitespace
Line 1064:1: E101 indentation contains mixed spaces and tabs
Line 1064:1: W191 indentation contains tabs
Line 1064:73: W291 trailing whitespace
Line 1065:1: W191 indentation contains tabs
Line 1065:73: W291 trailing whitespace
Line 1066:1: W191 indentation contains tabs
Line 1067:1: E101 indentation contains mixed spaces and tabs

Comment last updated at 2023-12-20 07:37:26 UTC

@github-actions github-actions bot left a comment

Hello there first time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first time contributors introduce themselves on the developer mailing list so we can get to know you. You can learn more about participating here. Please also add yourself to package/AUTHORS as part of this PR.

@github-actions

github-actions bot commented Sep 29, 2023

Linter Bot Results:

Hi @sriraksha-srinivasan! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

| Code Location | Outcome |
| --- | --- |
| main package | ⚠️ Possible failure |
| testsuite | ✅ Passed |

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/7272370645/job/19814375013


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

@codecov

codecov bot commented Sep 29, 2023

Codecov Report

Attention: Patch coverage is 44.44444%, with 5 lines in your changes missing coverage. Please review.

Project coverage is 93.39%. Comparing base (b808ef1) to head (291571f).
Report is 38 commits behind head on develop.

| Files | Patch % | Lines |
| --- | --- | --- |
| package/MDAnalysis/core/selection.py | 44.44% | 5 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4302      +/-   ##
===========================================
- Coverage    93.41%   93.39%   -0.02%     
===========================================
  Files          171      185      +14     
  Lines        22512    23635    +1123     
  Branches      4129     4130       +1     
===========================================
+ Hits         21029    22074    +1045     
- Misses         963     1041      +78     
  Partials       520      520              


@orbeckst
Member

@sriraksha-srinivasan would you be able to continue on this PR?

@MDAnalysis/coredevs could someone look after this hackathon PR? Some guidance on how to add tests, fix formatting, and add a CHANGELOG/AUTHORS entry would be good.

Member

@orbeckst orbeckst left a comment

Very good start, but we need a bit more: docs, so that users know that the lipid selection exists, and tests, to ensure that it works correctly now and in the future.

@@ -1056,6 +1056,28 @@ def _apply(self, group):
        return group[np.isin(nmidx, matches)]


class LipidSelection(Selection):
    token = 'lipid'

Member

Add a comment saying where you got the residue names. You explained it to me during the hackathon and that was very useful information, so capture it here in a comment!

@orbeckst I'd be happy to continue working on this PR. So far I have added the lipid residue names for the CHARMM36 force field, taken from the list of lipids in CHARMM's top_all36_lipid.rtf. The selection still needs to be extended to other force fields.

I'd like some help with the tests and am happy to discuss more in the developer channel on Discord.

Member

I'll add some more general comments on testing in the PR below.

Add your sources as a comment in the code, along the lines of

# Lipid residue names from CHARMM's top_all36_lipid.rtf.
# NOTE: need to add names from other FFs.

@orbeckst orbeckst added the hackathon part of a MDAnalysis coding event label Oct 10, 2023
@orbeckst
Member

orbeckst commented Oct 24, 2023

More about testing. Before getting started, two important points to consider:

  • Writing your first test is hard, but don't let this deter you! Use the docs and just understand that writing tests is something that you learn, not something you just know how to do.
  • Also be aware that often writing tests is more work than writing the actual code — scientific code needs to be correct and our tests are a crucial component of maintaining high-quality software.

What are tests and how do we write them? — pytest!

Tests are small pieces of code that

  1. run the functionality that you want to test (e.g., u.select_atoms("lipid"))
  2. compare the output to a known correct reference
  3. raise an AssertionError if the output disagrees with the reference
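
As a minimal sketch of these three steps in plain Python: `select_lipid_indices` and `LIPID_RESNAMES` are hypothetical stand-ins for the code under test (they are not part of MDAnalysis), and the two residue names are only an illustrative subset.

```python
# Hypothetical stand-in for the code under test; NOT part of MDAnalysis.
LIPID_RESNAMES = {"POPE", "POPG"}  # illustrative subset only


def select_lipid_indices(resnames):
    """Return the indices of entries whose residue name is a known lipid."""
    return [i for i, name in enumerate(resnames) if name in LIPID_RESNAMES]


def test_select_lipid_indices():
    # 1. run the functionality under test
    result = select_lipid_indices(["ALA", "POPE", "GLY", "POPG"])
    # 2. known correct reference, worked out by hand
    reference = [1, 3]
    # 3. a bare assert raises AssertionError if the two disagree
    assert result == reference
```

pytest would pick up `test_select_lipid_indices` by its name; no registration is needed.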

We are using the pytest framework, which is a bit difficult to understand at first, but I encourage you to start reading about it (and to go back to the docs repeatedly; that's what I have to do all the time). The basic idea is that you write tests as functions like test_function(...) (see pytest: Getting Started):

import MDAnalysis as mda

def test_create_gro_Universe():
    u = mda.Universe("test.gro")    # need to have example file available
    assert u.atoms.n_atoms == 4314  # correct number of atoms in test.gro

pytest then automagically finds all these tests and runs them. (How to make example inputs available is a separate discussion: primarily we use "fixtures".)

You can also group multiple test functions into a class. We then have a class TestSomethingBigger with methods test_thing_1(self, ...), test_thing_2(self, ...), where each of the test_*() methods is a test just like a test function.
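
A small sketch of such a class, again in plain Python. `is_lipid` is a hypothetical helper invented for this illustration, not MDAnalysis API. pytest collects classes whose names start with `Test` (and that define no `__init__`) and runs every `test_*` method on them.

```python
# Hypothetical helper for illustration only; not part of MDAnalysis.
def is_lipid(resname):
    """Return True if resname is in a small, illustrative set of lipid names."""
    return resname in {"POPE", "POPG"}


class TestIsLipid:
    # Each test_* method is collected and run as an independent test.
    def test_known_lipid(self):
        assert is_lipid("POPE")

    def test_amino_acid_is_not_lipid(self):
        assert not is_lipid("ALA")
```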

You'll probably start by copying some existing code and then refining it.

Running tests locally

The User Guide has a section on testing. It's a bit outdated in parts but still a good place to start to get an idea.

You should be able to run tests locally to check that your new tests work as expected. Once you have checked out the sources and have a developer installation of MDAnalysis (i.e., you ran pip install -e package/ from the top directory of the source), you can run ALL tests with

pytest --disable-pytest-warnings testsuite/MDAnalysisTests

This runs ~20,000 tests and takes ~10 minutes.

When you're working on a specific feature, you don't want to run all tests every time you change something, so you run only a specific test file. In your case, something like

pytest -v testsuite/MDAnalysisTests/core/test_atomselections.py

to run just the tests related to selections.

Writing tests for lipid selection

Look at the test for the protein selection

def test_protein(self, universe):

You'll have to write something similar but we can't just add it under the TestSelectionsCHARMM class.

class TestSelectionsCHARMM(object):

because this whole class uses an input file that only contains a protein. There's nothing there that your new lipid selection could select.

Class TestLipidSelection

I would recommend you start with a class TestLipidSelection; we can then group everything related to lipids there.

class TestLipidSelection:
    # testing new lipid selection (issue #2082)
    pass

Test file: add a fixture

We have at least one example file with lipids, namely

"GRO_MEMPROT", "XTC_MEMPROT", # YiiP transporter in POPE:POPG lipids with Na+, Cl-, Zn2+ dummy model without water

so we start by creating a fixture to have a universe available for our tests:

class TestLipidSelection:
    # testing new lipid selection (issue #2082)

    @pytest.fixture
    def u_lipids(self):
        return mda.Universe(GRO_MEMPROT)

Add GRO_MEMPROT to the import statement in

from MDAnalysis.tests.datafiles import (

so that it becomes available.

You can then use the fixture inside the class as u_lipids as I'll show below. (Have a look at the pytest docs on using fixtures.)

Adding the first test

Let's apply the selection and add a test:

class TestLipidSelection:
    # testing new lipid selection (issue #2082)

    @pytest.fixture
    def u_lipids(self):
        return mda.Universe(GRO_MEMPROT)

    def test_CHARMM_lipid_selection(self, u_lipids):
        lipids = u_lipids.select_atoms("lipid")
        reference = u_lipids.select_atoms("resname POPE POPG")
        assert lipids == reference

This should give you an idea of how to structure the tests. Importantly, we rely on knowing that the u_lipids universe contains lipids with resname POPE or POPG (but do double-check this!). You can add more assertions.
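
One assertion worth considering: an equality check passes trivially when both sides are empty, for example if the file turned out not to contain POPE or POPG after all. A minimal sketch in plain Python, with lists of atom indices standing in for the two selections (the helper name `assert_same_nonempty_selection` is hypothetical, not an MDAnalysis function):

```python
def assert_same_nonempty_selection(selected, reference):
    """Compare two lists of atom indices, rejecting an empty reference."""
    # Guard first: [] == [] is True, which would hide a wrong resname.
    assert len(reference) > 0, "reference selection is empty"
    assert selected == reference


# Passes: both sides hold the same, non-empty indices.
assert_same_nonempty_selection([4, 5, 6], [4, 5, 6])
```

In the real test the same guard could be a check on the reference group's n_atoms before the comparison.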

Add tests for all residues in the lipid selection

The obvious problem with my first test is that it only checks a subset of your defined lipids. We could include example files that contain all lipids, but we can also generate a fake universe (using make_Universe) that contains what we expect to see and then run your selection on the fake universe. This is exactly how we test the protein selection comprehensively:

@pytest.mark.parametrize('resname', sorted(MDAnalysis.core.selection.ProteinSelection.prot_res))
def test_protein_resnames(self, resname):
    u = make_Universe(('resnames',))
    # set half the residues' names to the resname we're testing
    myprot = u.residues[::2]
    # Windows note: the parametrized test input string objects
    # are actually of type 'numpy.str_' and coercion to str
    # proper is needed for unit test on Windows
    myprot.resnames = str(resname)
    # select protein
    sel = u.select_atoms('protein')
    # check that contents (atom indices) are identical afterwards
    assert_equal(myprot.atoms.ix, sel.ix)

You should adapt the test_protein_resnames() test into a test_lipid_resnames() in your TestLipidSelection class.
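
The idea behind that test can be sketched without MDAnalysis. In the plain-Python model below, a list of residue names stands in for the fake universe, and `LIPID_RES`, `select_lipid`, and `check_lipid_resname` are hypothetical names for this illustration only. The real test would use make_Universe, select_atoms('lipid'), and whatever attribute your LipidSelection class stores its residue names in.

```python
# Illustrative subset only; the real set lives in the selection class.
LIPID_RES = {"POPE", "POPG"}


def select_lipid(resnames):
    """Stand-in for the selection: indices of residues with a lipid name."""
    return [i for i, name in enumerate(resnames) if name in LIPID_RES]


def check_lipid_resname(resname, n_residues=10):
    # Fake "universe": every residue starts with a non-lipid placeholder name.
    resnames = ["XXX"] * n_residues
    # Set every second residue to the name under test.
    expected = list(range(0, n_residues, 2))
    for i in expected:
        resnames[i] = str(resname)
    # The selection must return exactly the residues we renamed.
    assert select_lipid(resnames) == expected


def test_lipid_resnames():
    # With pytest this loop would be a parametrize over sorted(LIPID_RES).
    for resname in sorted(LIPID_RES):
        check_lipid_resname(resname)
```

With parametrize, each residue name becomes its own test case, so a failure points at the specific name.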

@orbeckst
Member

@sriraksha-srinivasan do you want to give it a try with the tests? If you make some changes I'll review them and give further guidance.

@sriraksha-srinivasan
Author

sriraksha-srinivasan commented Dec 15, 2023

Hi @orbeckst , thanks a ton for the detailed info on tests, I would love to give it a try asap and get back to you.

@orbeckst
Member

@sriraksha-srinivasan is this PR something that you still want to work on?

@orbeckst orbeckst added the close? Evaluate if issue/PR is stale and can be closed. label Mar 29, 2024
@orbeckst orbeckst self-assigned this Mar 29, 2024
@sriraksha-srinivasan
Author

> @sriraksha-srinivasan is this PR something that you still want to work on?

@orbeckst yes absolutely, I will update you here with the tests asap.

@orbeckst orbeckst removed the close? Evaluate if issue/PR is stale and can be closed. label Apr 18, 2024
Labels
Component-Core enhancement hackathon part of a MDAnalysis coding event
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add "lipid" keyword to select_atoms
4 participants