-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/add bgzf typing #4654
base: master
Are you sure you want to change the base?
Feat/add bgzf typing #4654
Conversation
improvement: make trivial runtime changes
Some of these style failures conflict with black's defaults. I'll just disable the codes for this file for now. |
else: | ||
crc = struct.pack("<I", crc) | ||
crc = struct.pack("<I", checksum) | ||
if expected_crc != crc: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You still have some crc
usage despite changing the varible name above...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is deliberate. Static type checkers won't like crc
being of two different types:
crc
is first inferred to be anint
from the linecrc = zlib.crc32(data)
- Type checkers will subsequently complain on the line
crc = struct.pack(...)
, becausestruct.pack(...)
returns abytes
object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I've hit that kind of issue myself - using a new variable name does seem to be clearest. I may have not read this change carefully enough.
self._block_raw_length = None | ||
self._buffers: dict[int, tuple[_BufferT, int]] = {} | ||
# Is there a safe default for a non-`None` integer? | ||
self._block_start_offset: int = None # type: ignore[assignment] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional[int] perhaps?
We can't used zero here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional[int]
is the same as int | None
, and doesn't play nicely with the rest of the class body. Essentially, if you set _block_start_offset: Optional[int]
, you'll have to do an assertion that it's not None
whenever it's used in the context of an integer, in the rest of the class body; otherwise the type-checker will complain.
The assertions are visual noise, however - I'd prefer not to add them. I'll leave it like this for now, because the next line (self._load_block
) is guaranteed to set it to an integer that's not None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have had to introduce occasional assert isintance(...)
lines in my own project. As you say, largely visual noise :(
_Self = tp.TypeVar("_Self") | ||
# fmt: off | ||
# Writing and updating modes | ||
_ReadingBinaryMode: TypeAlias = tp.Literal["rb", "br"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, esentially an enum - does this work nicely in a compatible code editor for suggesting values? Until now I'd have just annotated this as str
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some language server protocols might work well with typing.Literal
- I use PyCharm and unfortunately it doesn't seem to do so :(
# fmt: on | ||
def __init__( | ||
self, | ||
filename: _typeshed.StrPath | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At what point can we use XXX | None
rather than Optional[XXX]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same explanation as #4654 (comment)
def open(filename, mode="rb"): | ||
# fmt: off | ||
@tp.overload | ||
def open(filename: _typeshed.StrPath, mode: _ReadingBinaryMode = "rb") -> BgzfReader[bytes]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it these lines triggering D103 missing docstring in public functions
? That feels like a false positive - indeed PyCQA/pydocstyle#419 agreed.
However, pydocstyle is now deprecated. I will look at droping this from our flake8 configuration and enabling it via ruff as suggested.
I hereby agree to dual licence this and any previous contributions under both
the Biopython License Agreement AND the BSD 3-Clause License.
I have read the
CONTRIBUTING.rst
file, have runpre-commit
locally, and understand that continuous integration checks will be used to
confirm the Biopython unit tests and style checks pass with these changes.
I have added my name to the alphabetical contributors listings in the files
NEWS.rst
andCONTRIB.rst
as part of this pull request, am listedalready, or do not wish to be listed. (This acknowledgement is optional.)
@peterjc As discussed in #4651 (comment), this is a conversion of
Bio.bgzf
to--strict
static typing. Setting this as a draft PR for initial feedback.Part of this required making some runtime changes, which are done in the following commits:
I believe these taken together are fully compatible with the usage of the module as given by the specifications, but this needs thorough review.