Non english months in pdb headers 4449 #4450
Currently this silently ignores the bad month in a PDB file, using zero instead. Unless our PDB team think otherwise, I would prefer this to consider the permissive/strict mode, and adjust the behavior accordingly. CC @JoaoRodrigues @etal @jgreener64 (not an exhaustive list)
I don't have strong opinions here. On the one hand, I don't think returning a zero will cause problems in user code. On the other hand, the HEADER record format is specified at https://www.wwpdb.org/documentation/file-format-content/format33/sect2.html#HEADER, so I would guess that all PDB entries have a legitimate date (I haven't checked).
I strongly suspect the test case here is from a third party tool exporting in PDB format, not a file from an official repository/mirror.
The test case is from a PDB file generated by a third party tool and is not a public repository file. Does Biopython only support PDB files that have been deposited at wwPDB? The month 'xxx' is tolerated in the code without strict/permissive checks.
We try to parse "unofficial" files, but where they break the specification it is reasonable to give a warning or error. In this case I personally would not want this to be silently parsed in strict mode.
Updated with warning and permissive flag carried through to the
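For readers following the thread, here is a minimal sketch of the behaviour being discussed: warn and fall back to zero in permissive mode, fail in strict mode. The function name `month_number`, the `permissive` parameter, and the `HeaderDateWarning` class are hypothetical illustrations, not Biopython's actual API; only the message text is taken from the diff under review.

```python
import warnings

# Three-letter English month abbreviations, as used in PDB HEADER dates.
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


class HeaderDateWarning(UserWarning):
    """Hypothetical warning class used only in this sketch."""


def month_number(month_name, permissive=True):
    """Return 1-12 for a recognized month abbreviation.

    For an unrecognized month, warn and return 0 in permissive mode,
    or raise ValueError in strict mode (permissive=False).
    """
    try:
        return MONTHS.index(month_name.lower()) + 1
    except ValueError:
        message = f"Non-standard month in PDB header: {month_name}."
        if not permissive:
            # "from None" hides the internal tuple.index() traceback.
            raise ValueError(message) from None
        warnings.warn(message, HeaderDateWarning)
        return 0
```

With this sketch, `month_number("JAN")` returns 1, `month_number("MAI")` warns and returns 0, and `month_number("MAI", permissive=False)` raises `ValueError`.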
A minor improvement aside (warning message and class), this looks good to me.
        f"Non-standard month in PDB header: {month_name}."
    ) from None

    warnings.warn(
I still object to warnings, as by default a given warning gets printed only once per code location in one Python session.
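The objection above can be reproduced with the standard library alone. The sketch below uses hypothetical helper names (`warn_bad_month`, `count_shown`) and compares the `"default"` filter action, which shows a repeated warning from the same line only once, with `"always"`, which shows every occurrence.

```python
import warnings


def warn_bad_month(month_name):
    # Hypothetical helper: always warns from the same source line.
    warnings.warn(
        f"Non-standard month in PDB header: {month_name}.", UserWarning
    )


def count_shown(filter_action, repeats=3):
    """Count how many of `repeats` identical warnings pass the filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(filter_action)
        for _ in range(repeats):
            warn_bad_month("MAI")
    return len(caught)
```

Here `count_shown("default")` gives 1 and `count_shown("always")` gives 3. Note that the suppression is keyed on the message text as well as the location, so a warning naming a different month would be shown separately.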
I hereby agree to dual licence this and any previous contributions under both
the Biopython License Agreement AND the BSD 3-Clause License.
I have read the CONTRIBUTING.rst file, have run pre-commit locally, and understand that continuous integration checks will be used to confirm the Biopython unit tests and style checks pass with these changes.
[ ] I have added my name to the alphabetical contributors listings in the files NEWS.rst and CONTRIB.rst as part of this pull request, am listed already, or do not wish to be listed. (This acknowledgement is optional.)
Closes #4449
Unrecognized months in the PDB header no longer raise ValueError but return '0', the same behaviour as for a month of 'xxx'. Unit test included.
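As a rough illustration of the behaviour described here, the sketch below assumes a lookup table with the placeholder 'xxx' at index 0, so that both 'xxx' and any unrecognized month map to '0'. The names `ALL_MONTHS` and `month_field` are hypothetical and are not claimed to match the actual implementation in this pull request.

```python
# Hypothetical lookup table: the placeholder "xxx" occupies index 0,
# so real months fall on indices 1-12.
ALL_MONTHS = ["xxx", "jan", "feb", "mar", "apr", "may", "jun",
              "jul", "aug", "sep", "oct", "nov", "dec"]


def month_field(month_name):
    """Return the month number as a string, or "0" if not recognized."""
    name = month_name.lower()
    if name in ALL_MONTHS:
        return str(ALL_MONTHS.index(name))
    # Unrecognized (for example non-English) month: same result as "xxx".
    return "0"
```

A unit test for this sketch would check that `month_field("JAN")` is `"1"` and that `month_field("xxx")` and `month_field("MAI")` are both `"0"`.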