feat: add box selection to support a box like selection from given selection #4324

Open: Cloudac7 wants to merge 10 commits into base: develop

@Cloudac7 Cloudac7 commented Oct 25, 2023

Description:

This PR introduces a new geometry selection, as described in #4323.
The new selection keyword uses the following syntax:

box dimension(s) d1_min d1_max (d2_min d2_max) (d3_min d3_max) selection

It selects atoms inside a box-shaped zone whose bounds are measured from the center of geometry of selection.
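
A minimal sketch of the idea in plain Python, not the code from this PR: atoms are kept when their offset from the center of geometry of a reference group falls inside the given bounds on each named axis. The names `box_select`, `center_of_geometry` and `AXES` are hypothetical, bounds are passed as explicit (lower, upper) pairs and treated as inclusive, and periodic boundaries are ignored; all of these are assumptions made for illustration.

```python
from statistics import fmean

AXES = {"x": 0, "y": 1, "z": 2}


def center_of_geometry(points):
    """Unweighted mean position of a list of (x, y, z) tuples."""
    return tuple(fmean(p[i] for p in points) for i in range(3))


def box_select(points, reference, dims, bounds):
    """Return indices of points whose offset from the reference COG is in bounds.

    dims   -- axis string such as "z" or "xz" (hypothetical spelling)
    bounds -- one (lower, upper) pair per character in dims, inclusive
    """
    if len(dims) != len(bounds):
        raise ValueError("need exactly one (lower, upper) pair per axis")
    cog = center_of_geometry(reference)
    selected = []
    for idx, point in enumerate(points):
        inside = all(
            lower <= point[AXES[axis]] - cog[AXES[axis]] <= upper
            for axis, (lower, upper) in zip(dims, bounds)
        )
        if inside:
            selected.append(idx)
    return selected


# Reference group with COG (1, 0, 11); three candidate atoms.
reference = [(0.0, 0.0, 10.0), (2.0, 0.0, 12.0)]
points = [(1.0, 0.0, 15.0), (1.0, 0.0, 8.0), (9.0, 0.0, 12.0)]

# Slab from 0 to 8 above the reference COG along z.
above = box_select(points, reference, "z", [(0.0, 8.0)])
# Same slab, additionally limited to +/- 2 around the COG along x.
column = box_select(points, reference, "xz", [(-2.0, 2.0), (0.0, 8.0)])
print(above, column)
```

Leaving an axis out of `dims` leaves that axis unbounded, which is how the optional `(d2 ...)` and `(d3 ...)` pairs in the syntax above are read here.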

Changes made in this Pull Request:

  • Add new geometry selection: box

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4324.org.readthedocs.build/en/4324/

pep8speaks commented Oct 25, 2023

Hello @Cloudac7! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 3114:80: E501 line too long (98 > 79 characters)
Line 3118:74: W291 trailing whitespace
Line 3119:78: W291 trailing whitespace

Line 560:80: E501 line too long (83 > 79 characters)
Line 565:80: E501 line too long (83 > 79 characters)
Line 570:80: E501 line too long (83 > 79 characters)
Line 605:80: E501 line too long (80 > 79 characters)

Line 819:80: E501 line too long (81 > 79 characters)

Comment last updated at 2024-03-25 19:34:45 UTC

@github-actions github-actions bot left a comment

Hello there first time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first time contributors introduce themselves on the developer mailing list so we can get to know you. You can learn more about participating here. Please also add yourself to package/AUTHORS as part of this PR.

github-actions bot commented Oct 25, 2023

Linter Bot Results:

Hi @Cloudac7! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

| Code Location | Outcome |
|---------------|---------|
| main package | ⚠️ Possible failure |
| testsuite | ⚠️ Possible failure |

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/8425924832/job/23073093482


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

codecov bot commented Oct 25, 2023

Codecov Report

Attention: Patch coverage is 98.00000% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 93.37%. Comparing base (0582265) to head (8d7f2ce).

❗ Current head 8d7f2ce differs from pull request most recent head f72c618. Consider uploading reports for the commit f72c618 to get more accurate results

| Files | Patch % | Lines |
|-------|---------|-------|
| package/MDAnalysis/core/selection.py | 98.00% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4324      +/-   ##
===========================================
- Coverage    93.65%   93.37%   -0.28%     
===========================================
  Files          168      170       +2     
  Lines        21215    22375    +1160     
  Branches      3908     4092     +184     
===========================================
+ Hits         19869    20893    +1024     
- Misses         888      963      +75     
- Partials       458      519      +61     

Member

@IAlibay IAlibay left a comment

Apologies for the slow reviews @Cloudac7

I've added some initial comments.

@MDAnalysis/coredevs could someone please take on the responsibility of reviewing this PR?

testsuite/MDAnalysisTests/core/test_atomselections.py (outdated, resolved)
@@ -14,6 +14,8 @@
import sys
import os
import datetime
sys.path.insert(0, os.path.abspath('../../..'))
Member

Could you explain why this is being added?

Author

@Cloudac7 Cloudac7 Nov 6, 2023

Sorry, I forgot to remove that line.
I am developing on macOS, where building the documentation with the original code raises ModuleNotFoundError: No module named 'MDAnalysis'. I followed the instructions in the latest developer documentation and found no solution there. After adding the line further up in the file, sphinx-build works.

@@ -277,6 +277,23 @@ cyzone *externalRadius* *zMax* *zMin* *selection*
relative to the COG of *selection*, instead of absolute z-values
in the box.

box *dimension(s)* *d1_Max* *d1_Min* (*d2_Max* *d2_Min*) (*d3_Max* *d3_Min*) *selection*
Member

I'm very likely missing something, but I don't know if I fully understand how this differs from using prop - could you maybe explain the benefits of this new selection over using a succession of prop?

Author

As I understand it, prop defines a static zone along each axis. The new box selection works like sphzone, cyzone, or cylayer: it defines a zone relative to another selection, so the zone can also be dynamic when updating=True.
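
To make the distinction concrete, here is a small plain-Python sketch (no MDAnalysis calls; the frame data and the helper `in_window` are made up for illustration). A fixed absolute z-window stands in for a chain of prop conditions, while the relative window is re-centered on the reference group in every frame, which is the behaviour the comment above attributes to zone-style selections.

```python
def in_window(value, lower, upper):
    """Inclusive range check used for both kinds of window."""
    return lower <= value <= upper


# Two hypothetical frames: the bilayer drifts up by 5 along z,
# the water z-coordinates stay where they are.
frames = [
    {"bilayer_z": [20.0, 22.0], "water_z": [24.0, 30.0, 35.0]},
    {"bilayer_z": [25.0, 27.0], "water_z": [24.0, 30.0, 35.0]},
]

static_hits = []    # absolute window 21..29, fixed for all frames
relative_hits = []  # window 0..8 measured from the bilayer center

for frame in frames:
    center = sum(frame["bilayer_z"]) / len(frame["bilayer_z"])
    static_hits.append(
        [z for z in frame["water_z"] if in_window(z, 21.0, 29.0)]
    )
    relative_hits.append(
        [z for z in frame["water_z"] if in_window(z - center, 0.0, 8.0)]
    )

print(static_hits)    # [[24.0], [24.0]]
print(relative_hits)  # [[24.0], [30.0]]
```

The static window keeps selecting the same region of space, whereas the relative window follows the bilayer and picks a different water in the second frame.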

Member

See @Cloudac7's description in #4323 (comment), which looks sensible to me. If we have cyzone then we can also have box.

Member

this would make more sense to me as min to max values rather than the opposite. E.g. box z 0 8 bilayer to select a slice of water above a bilayer and box z -8 0 to get the other half.

Member

@richardjgowers I had the same thoughts but unfortunately cyzone and friends started this inverse "max min" syntax. I feel consistency across the *zone and *layer selections is more important here than purity.

(I mean I am really looking for someone to convince me we should do min max here ;-) because I know that I am going to mistype it...)

package/CHANGELOG (outdated, resolved)
super().__init__(parser, tokens)
self.periodic = parser.periodic
self.direction = tokens.popleft()
if self.direction not in self.combination:
Member

can we set xmin xmax etc here to None then optionally set them to something else in the else: branch below? I think this would simplify the __getattribute__ usage below
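
One way this suggestion could look, as a hedged sketch rather than the PR's actual class: every axis starts out as None and only the axes named in the direction token are filled in, so later code can test for None instead of looking attributes up dynamically. `BoxBounds`, `valid_directions` and the `bounds` dict are hypothetical names, a dict is used in place of separate xmin/xmax attributes, and the tokens are read in max-then-min order as in the documentation diff in this PR, which is still under discussion.

```python
from collections import deque


class BoxBounds:
    """Hypothetical stand-in for the token handling of the box selection."""

    # Assumption: only ordered axis combinations are accepted.
    valid_directions = ("x", "y", "z", "xy", "xz", "yz", "xyz")

    def __init__(self, tokens):
        self.direction = tokens.popleft()
        if self.direction not in self.valid_directions:
            raise ValueError(
                f"The direction '{self.direction}' is not valid. "
                f"Must be one of {self.valid_directions}"
            )
        # Every axis starts unbounded ...
        self.bounds = {"x": None, "y": None, "z": None}
        # ... and only the axes named in the direction token are set.
        for axis in self.direction:
            upper = float(tokens.popleft())  # assumed order: max first
            lower = float(tokens.popleft())
            self.bounds[axis] = (lower, upper)


tokens = deque("xz 5 -5 8 0 protein".split())
parsed = BoxBounds(tokens)
print(parsed.bounds)  # {'x': (-5.0, 5.0), 'y': None, 'z': (0.0, 8.0)}
print(list(tokens))   # ['protein']
```

Whatever remains in the token queue afterwards (here `protein`) is left for the sub-selection parser.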

Comment on lines 618 to 622
# Triclinic version
tribox = group.universe.trajectory.ts.triclinic_dimensions
vecs -= tribox[2] * np.rint(vecs[:, 2] / tribox[2][2])[:, None]
vecs -= tribox[1] * np.rint(vecs[:, 1] / tribox[1][1])[:, None]
vecs -= tribox[0] * np.rint(vecs[:, 0] / tribox[0][0])[:, None]
Member

this isn't technically correct for some triclinic boxes with some shifts.

Member

I think you'll want this function:

def minimize_vectors(vectors: npt.NDArray, box: npt.NDArray) -> npt.NDArray:

But it won't handle the case where only some of the box dimensions are given (i.e. a semi-periodic calculation).
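
A small plain-Python illustration of the concern, under stated assumptions: `naive_wrap` mirrors the sequential rounding from lines 618 to 622, and `brute_force_minimum_image` is a hypothetical reference that additionally tries the 27 neighbouring images and keeps the shortest. It is not how `minimize_vectors` is implemented and it assumes a box that is not excessively skewed. The box is given as three row vectors.

```python
import itertools
import math


def naive_wrap(vec, box):
    """Sequential rounding along c, then b, then a (as in the reviewed code)."""
    v = list(vec)
    for axis in (2, 1, 0):
        shift = round(v[axis] / box[axis][axis])
        v = [vi - shift * bi for vi, bi in zip(v, box[axis])]
    return v


def brute_force_minimum_image(vec, box):
    """Start from the naive result, then test all neighbouring images."""
    start = naive_wrap(vec, box)
    best = start
    for i, j, k in itertools.product((-1, 0, 1), repeat=3):
        candidate = [
            start[d] + i * box[0][d] + j * box[1][d] + k * box[2][d]
            for d in range(3)
        ]
        if math.hypot(*candidate) < math.hypot(*best):
            best = candidate
    return best


# Skewed box: the b vector has an x component of half the a length.
box = [(10.0, 0.0, 0.0), (5.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
vec = (-4.0, 4.9, 0.0)

naive = naive_wrap(vec, box)
best = brute_force_minimum_image(vec, box)
print(math.hypot(*naive), math.hypot(*best))
```

For this vector the sequential rounding leaves the input unchanged, while shifting by +a and -b gives a strictly shorter image, so the per-axis approach does not return the minimum image here.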

Comment on lines 543 to 559
combination = [
"x",
"y",
"z",
"xy",
"xz",
"yz",
"yx",
"zx",
"zy",
"xyz",
"xzy",
"yxz",
"yzx",
"zxy",
"zyx",
]
Member

Suggested change
combination = [
"x",
"y",
"z",
"xy",
"xz",
"yz",
"yx",
"zx",
"zy",
"xyz",
"xzy",
"yxz",
"yzx",
"zxy",
"zyx",
]
combination = [''.join(s) for i in range(1, 4) for s in itertools.permutations("xyz", i)]

Just for fun. Probably less readable than what you wrote.

But the question is, do we really need both xy and yx? What's the use case of having all permutations as opposed to only combinations?

Suggested change
combination = [
"x",
"y",
"z",
"xy",
"xz",
"yz",
"yx",
"zx",
"zy",
"xyz",
"xzy",
"yxz",
"yzx",
"zxy",
"zyx",
]
combination = [''.join(s) for i in range(1, 4) for s in itertools.combinations("xyz", i)]

Nitpick: the variable is called "combination", so the name might have to change to "permutation" ("permutations"?) if that's the actual use case.
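
For reference, a quick standalone check of what the two suggested one-liners produce. The variable names here are illustrative only.

```python
import itertools

axes = "xyz"

# Order matters: "xy" and "yx" are both generated.
permutations = [
    "".join(s) for i in range(1, 4) for s in itertools.permutations(axes, i)
]

# Order does not matter: only "xy" is generated, never "yx".
combinations = [
    "".join(s) for i in range(1, 4) for s in itertools.combinations(axes, i)
]

print(len(permutations), permutations)
print(len(combinations), combinations)
```

The permutation version reproduces the 15 entries of the hand-written list (in a slightly different order), while the combination version yields 7 entries that always follow the x, y, z order.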

Author

@Cloudac7 Cloudac7 Nov 9, 2023

The extensive list of all permutations of x, y, and z is designed to avoid the need to calculate the O(3!) = O(1) permutations each time a selection is executed. This has been reformatted by darker to comply with PEP8 standards. It's a minor optimization that can be overlooked, though. 🤣

Regarding the latter point, my original intention was to enhance the flexibility of the selection by making both yx and xy acceptable. However, this inadvertently added extra complexity. You're correct in suggesting that it would be better to have users write in the sequence of x -> y -> z. This approach would eliminate the need for additional processing.

Comment on lines 571 to 572
"The direction '{}' is not valid. Must be combination of {}"
"".format(self.direction, ["x", "y", "z"])
Member

Suggested change
"The direction '{}' is not valid. Must be combination of {}"
"".format(self.direction, ["x", "y", "z"])
f"The direction '{self.direction}' is not valid. Must be combination of {axis_map}"

doc: add doc about box selection

fix: reformat with darker and flake8
test: add error catching for box selection
fix: prevent abnormal input in direction
style: reformat CHANGELOG
commit 8d7f2ce
Author: Futaki Haduki <812556867@qq.com>
Date:   Thu Nov 9 17:23:57 2023 +0800

    Update package/MDAnalysis/core/groups.py

    Co-authored-by: Rocco Meli <r.meli@bluemail.ch>

commit cfdec5b
Author: Futaki Haduki <812556867@qq.com>
Date:   Thu Nov 9 17:23:44 2023 +0800

    Update package/MDAnalysis/core/groups.py

    Co-authored-by: Rocco Meli <r.meli@bluemail.ch>
@orbeckst
Member

@richardjgowers I am tentatively assigning the PR to you for shepherding it through. If it's too much then please unassign yourself and ping me. Thanks.

@orbeckst
Member

@richardjgowers could you have a look if @Cloudac7 addressed the reviewers' comments? Thanks!

Successfully merging this pull request may close these issues.

Add a geometry selection supporting selecting an zone shaped like a box
6 participants