Non-English months in PDB headers #4449

Open
exs-gbartlett opened this issue Sep 19, 2023 · 1 comment · May be fixed by #4450

Comments

@exs-gbartlett

Setup

I am reporting a problem with Biopython version, Python version, and operating
system as follows:

>>> import sys; print(sys.version)
3.8.17 | packaged by conda-forge | (default, Jun 16 2023, 07:11:34) 
[Clang 14.0.6 ]
>>> import platform; print(platform.python_implementation()); print (platform.platform())
CPython
macOS-10.16-x86_64-i386-64bit
>>> import Bio
>>> print(Bio.__version__)
1.81

(Please copy and run the above in your Python, and copy-and-paste the output)

Expected behaviour

The PDB header contains a deposition date in xx-xxx-xx format. If the month is given as a non-English abbreviation (e.g. Okt for October or Ago for August), Biopython should set the month to '0', as it does for the xxx case.

Actual behaviour

Reproduced from the parse_pdb_header._format_date function:

In [3]: pdb_date = "21-Ago-23"

In [4]: month = str(all_months.index(pdb_date[3:6]))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[4], line 1
----> 1 month = str(all_months.index(pdb_date[3:6]))

ValueError: 'Ago' is not in list

Proposed solution:

The _format_date function should check that the month string is actually in the all_months list and, if it is not, set the month to '0'.
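A minimal sketch of that check, not the actual Biopython code. It assumes all_months is a list with the "xxx" placeholder at index 0 followed by the twelve English abbreviations; the names ALL_MONTHS and lookup_month are hypothetical and only used for illustration:

```python
# Assumed layout of the month table: "xxx" placeholder at index 0, then Jan..Dec.
ALL_MONTHS = [
    "xxx", "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
]


def lookup_month(pdb_date):
    """Return the month of a DD-Mon-YY date as a string, or '0' if unrecognised."""
    abbrev = pdb_date[3:6]
    # Membership test first, so an unknown abbreviation never reaches .index()
    # and therefore cannot raise ValueError.
    if abbrev in ALL_MONTHS:
        return str(ALL_MONTHS.index(abbrev))
    return "0"


print(lookup_month("21-Ago-23"))  # unrecognised abbreviation, treated like "xxx"
print(lookup_month("21-Aug-23"))
```

With this, "21-Ago-23" is handled the same way as "21-xxx-23" instead of raising.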

@peterjc
Member

peterjc commented Sep 20, 2023

Where did this PDB file come from? I would argue it is technically an invalid file, but the current ValueError is not ideal.

This could follow the existing strict/permissive mode behaviour, and using 0 or None for the month might be acceptable.
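One way the strict/permissive idea could be sketched, purely as an illustration. The permissive flag, the function name and the warning text are assumptions made for this example and are not part of the existing parse_pdb_header API; the month table layout is assumed as well:

```python
import warnings

# Assumed layout of the month table: "xxx" placeholder at index 0, then Jan..Dec.
ALL_MONTHS = [
    "xxx", "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
]


def month_index_or_none(pdb_date, permissive=True):
    """Return the month index of a DD-Mon-YY date.

    Hypothetical behaviour: in permissive mode an unknown month yields a
    warning and None; in strict mode it raises ValueError.
    """
    abbrev = pdb_date[3:6]
    if abbrev in ALL_MONTHS:
        return ALL_MONTHS.index(abbrev)
    message = f"Unrecognised month {abbrev!r} in PDB date {pdb_date!r}"
    if not permissive:
        raise ValueError(message)
    warnings.warn(message)
    return None
```

Returning None rather than 0 keeps "month missing" (xxx) distinguishable from "month unreadable", at the cost of callers having to handle None.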

@exs-gbartlett exs-gbartlett linked a pull request Sep 20, 2023 that will close this issue