Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Survival probability: Intermittency, Step, Residues, Overlapping SP Selection Example #2226

Merged
merged 20 commits into from Apr 6, 2019

Conversation

bieniekmateusz
Copy link
Member

@bieniekmateusz bieniekmateusz commented Mar 21, 2019

Fixes #

  • "Step" parameter does not load frames which are not used (dependant on the value of tau_max). This is great of sampling long trajectories.

Changes made in this Pull Request:

  • residue-wise SP can be carried out
  • Intermittency is defined as a number of consecutive absences allowed
  • Intermittency combined with Step lead to consistent behaviour
  • the raw data are made available (helps with testing, allows building upon the values)
  • simplified and extended documentation to include SP around overlapping areas (for example, headgroups in lipids)
  • Test Cases for all new functionalities have been added

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

We generalised the code to be applicable to other situations which is why we would like to ask if we should move it out of waterdynamics.py to its own file (survivalprobability.py in analysis).

The intermittency, for the sake of simplicity, is defined in terms of consecutive absences. This is calculated in a single pass over the data.

We've used the new functionalities and plan to publish on the topic to discuss the usefulness of these small changes.

This request is done on behalf of me and @p-j-smith

…nalysis. Before, all frames were loaded (and selection was always applied). Now, the frames which are not used are not loaded, thus improving the performance. This only happens when the step is larger than the tau_max + 1.

Test cases cover the situation where some frames are skipped. The test checks how many times the 'select_atoms' was called.

This is to help with the large size MD simulations analysis.
 was set up or not. Also, manually moving between the frames rather
 then using "for _ in trajectory[]". This removed unnecessary variables.
border condition for "tau_max" and "step".
as a gap: meaning that setting it to the value of 2 means that the atom
id 7, when in the sequence 7,X,X,7, where X means absence, will be
rewritten to 7,7,7,7. This way the intermittency does not affect the
actual routines for SP calculation. The array of IDs should never
be big so this should not add any significant computational time.
to the selected IDs, using the selected IDs to verify the behaviour of
the intermittency.
parameter is complex and was accounted for here, together with new
test cases. If necessary, we load extra frames for each window,
to account for the borders: for the first and last frame in a window.
analysis. This requires several runs and averaging.
@codecov
Copy link

codecov bot commented Mar 21, 2019

Codecov Report

Merging #2226 into develop will decrease coverage by 1.12%.
The diff coverage is 95.74%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2226      +/-   ##
===========================================
- Coverage    90.72%   89.59%   -1.13%     
===========================================
  Files           15      158     +143     
  Lines         1983    19725   +17742     
  Branches         0     2780    +2780     
===========================================
+ Hits          1799    17672   +15873     
- Misses         184     1458    +1274     
- Partials         0      595     +595
Impacted Files Coverage Δ
package/MDAnalysis/analysis/waterdynamics.py 82.68% <95.74%> (ø)
util.py 88.14% <0%> (-0.02%) ⬇️
package/MDAnalysis/coordinates/GRO.py 93.87% <0%> (ø)
package/MDAnalysis/coordinates/base.py 93.3% <0%> (ø)
package/MDAnalysis/analysis/helanal.py 85.19% <0%> (ø)
package/MDAnalysis/topology/MinimalParser.py 100% <0%> (ø)
package/MDAnalysis/visualization/__init__.py 100% <0%> (ø)
package/MDAnalysis/core/qcprot.py 100% <0%> (ø)
...e/MDAnalysis/analysis/encore/clustering/cluster.py 96.22% <0%> (ø)
package/MDAnalysis/coordinates/DCD.py 97.16% <0%> (ø)
... and 135 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1be9f73...657f918. Read the comment docs.

@bieniekmateusz
Copy link
Member Author

Hi there, I only modified two files, however, I worked on the "develop" branch by mistake. So I tried to rebase the code to bring my commits to the front, which is why the review picks up so many files.

I'll see if I can move the commits differently to avoid what happened here.

@bieniekmateusz
Copy link
Member Author

Hi again, I see that there are some functionalities which could help with organising the commits: git cherry-pick in particular. Let me know if it is necessary and then I'll further look into it.

@orbeckst
Copy link
Member

orbeckst commented Apr 3, 2019

Looks like a clean diff, so no need to play more with git rebase and friends.

@orbeckst
Copy link
Member

orbeckst commented Apr 3, 2019

I merged latest develop into your branch so that the docs build correctly and don't fail your build.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Primarily I have comments on the docs.

Also include an entry for CHANGELOG.

I would like to see a review by some of the original authors of SP, e.g., @alejob .

package/MDAnalysis/analysis/waterdynamics.py Show resolved Hide resolved
package/MDAnalysis/analysis/waterdynamics.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/waterdynamics.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/waterdynamics.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/waterdynamics.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/waterdynamics.py Show resolved Hide resolved
package/MDAnalysis/analysis/waterdynamics.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/waterdynamics.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/waterdynamics.py Outdated Show resolved Hide resolved
package/MDAnalysis/analysis/waterdynamics.py Show resolved Hide resolved
@orbeckst
Copy link
Member

orbeckst commented Apr 3, 2019

We generalised the code to be applicable to other situations which is why we would like to ask if we should move it out of waterdynamics.py to its own file (survivalprobability.py in analysis).

I like the idea of moving it into its own module. How about doing this as a separate PR once this one is merged? The smaller a PR the better.

@orbeckst orbeckst mentioned this pull request Apr 3, 2019
4 tasks
@orbeckst
Copy link
Member

orbeckst commented Apr 3, 2019

I broke travis #2233 so once PR #2234 is merged we need to again merge develop into this branch, sorry.

@bieniekmateusz
Copy link
Member Author

The requested changes have been introduced. I've built the documentation and it appears fine. Cheers

@alejob
Copy link
Member

alejob commented Apr 3, 2019

We generalised the code to be applicable to other situations which is why we would like to ask if we should move it out of waterdynamics.py to its own file (survivalprobability.py in analysis).

I like the idea of moving it into its own module. How about doing this as a separate PR once this one is merged? The smaller a PR the better.

It sounds good, but how could we avoid to broke the old code? There is people that has already implemented scripts with waterdynamics.survivalprobability(), so the idea is to avoid broke that code. Maybe some kind of link inside the MDA code.

@bieniekmateusz
Copy link
Member Author

So moving it to another file would be a new PR in the future.

We could say that the code is moved and simply point the user to import the new path - if someone tries to use it. Luckily the function itself and how it works has not changed so it should be a minor inconvinience for the users, hopefully.

@orbeckst
Copy link
Member

orbeckst commented Apr 3, 2019 via email

@orbeckst
Copy link
Member

orbeckst commented Apr 4, 2019

Merged develop to get the travis fix from PR #2234

@bieniekmateusz
Copy link
Member Author

I think I've corrected everything, please let me know if I missed anything.

@bieniekmateusz
Copy link
Member Author

To give you heads up, after this is merged we will shortly send you another PR with a common autocorrelation function between the waterdynamics.HydrogenBondLifetimes and waterdynamics.SurvivalProbability.

@richardjgowers noted that it would be good to formalise and extract the common autocorrelation code (PR #1995). After careful considerations with @p-j-smith we now have a separate autocorrelation function that has intermittency built into it, and being independent, is easily testable. The test cases will be well-defined in a similar manner to the ones we've used for the SurvivalProbability. We've already verified that waterdynamics.HydrogenBondLifetimes behaves exactly the same way our autocorrelation does (continuous). These changes should to a big extent simplify the module. In the case of HydrogenBondLifetimes it would mean a simple function call replacing the entire class. This will help simplifying the documentation too (PR #2181)

Work done with @p-j-smith.

@@ -15,7 +15,7 @@ The rules for this file:
------------------------------------------------------------------------------
mm/dd/yy micaela-matta, xiki-tempula, zemanj, mattwthompson, orbeckst, aliehlen,
dpadula85, jbarnoud, manuel.nuno.melo, richardjgowers, mattwthompson,
ayushsuhane, picocentauri, NinadBhat
ayushsuhane, picocentauri, NinadBhat, bieniekmateusz, p-j-smith
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are @bieniekmateusz and @p-j-smith in AUTHORS?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We, both of us. We're doing pair-programming.

@orbeckst
Copy link
Member

orbeckst commented Apr 5, 2019

@alejob if you're happy with the PR could you please also add a formal "approve" review? Thanks.

Copy link
Member

@alejob alejob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks a good extension to SP, and the new functionalities are clearly explained.

leave a region of interest for up to two consecutive frames yet be treated as being present at all frames.
The default is continuous (0).
verbose : Boolean, optional
Print the progress to the console
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this, it gives great flexibility to SP.

@orbeckst
Copy link
Member

orbeckst commented Apr 5, 2019

Thanks @alejob !

I resolved the conflict in the CHANGELOG and we'll just let CI confirm that everything is still good.

@orbeckst orbeckst self-assigned this Apr 5, 2019
@p-j-smith p-j-smith mentioned this pull request Apr 5, 2019
4 tasks
@orbeckst orbeckst merged commit 668e183 into MDAnalysis:develop Apr 6, 2019
@orbeckst
Copy link
Member

orbeckst commented Apr 6, 2019

Congratulations @bieniekmateusz and @p-j-smith , merged! Thanks for your hard work (and patience).

Thanks @alejob for reviewing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants