Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace import * with explicit imports #645

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Erotemic
Copy link

@Erotemic Erotemic commented Jul 1, 2022

@ingydotnet I had originally planned to try a first PR where I added some documentation, but I re-prioritized this first. I find it very difficult to nagivate code that use import * because it's not immediately clear what is coming from where, especially if there are more than one of them. However, import * is undeniably useful when you are in the midst of writing a repo and you already know what lives where (or in other use cases where you need to mirror another module dynamically).

This is the reason I wrote mkinit, which is a library that helps autogenerate code roughly equivalent to import *. I rank the program on this repo and it doesn't seem there is any crazy dynamic logic that would mess up mkinit's static analysis. The code in the __init__.py is autogenerated as-is, I defined the __submodules__ attribute to restrict mkinit to only working with those modules that were already import *-ed and I added a __dev__ note comment that shows a user the commands that were used to do the auto-generation.

For the other files, I used mkinit to help, but it only works on __init__.py files at the moment (although, this use case does inspire me to upgrade it such that it can work it's magic to actually replace import * in files rather than just generating a specified __init__.py file). But what I did was I ran mkinit on the entire repo and told it to generate attrs for all the submodules, and then I did a manual copy/paste to overwrite things where appropriate.

Also, unless I missed some magic logic that defined a global loader variable in lib/yaml/__init__.py, then the functions add_constructor, add_path_resolver, etc... have a NameError in one of their code paths. If this is real, this does serve as evidence for the value of explicitly stating your imports. It makes it a lot easier for linters like pyflake to catch these errors (or maybe I'm just missing how that global was supposed to be populated).

@nitzmahone
Copy link
Member

I'm not philosophically opposed to replacing the import *s with an equivalent list of explicit imports for all the reasons you mention- that code is ancient and lazy, and almost certainly exposes types on the top-level module that had no business being there. That said, I'd rather not leave tool-specific bits in the code to regenerate it, especially because we'll probably almost never add to that list again (in favor of subpackages for any new API work). Not sure if your tool was only doing "used" imports or if it consulted the existing list from import * or what, but we definitely need to preserve every module type-attribute that was there before resulting from the wildcard imports, since for better or worse, the module did expose those as directly importable, so people are probably using them.

@Erotemic
Copy link
Author

Erotemic commented Jul 1, 2022

I have no issue with removing the notes about the autogeneration process from the code.

On your point about preserving the API exactly as is, the static version of mkinit does not look at what import * so normally there is a chance that it won't be exactly 1-to-1, however, the submodules it inspected were thankfully using the __all__ attribute, which restricts the behavior of import * to only use attributes listed there. In this case the static mkinit process (which also respects __all__ by default), should be using the same information and thus be 1-to-1. So I'm fairly confident that what I have is a no-op from a user perspective, but I think it's a good idea to verify that with dir, which I'll do.

On pip install of yaml 6.0 the result from dir(yaml) I get is:

orig = {'add_representer', 'FlowMappingStartToken', 'DocumentEndEvent', 'CUnsafeLoader', 'BlockSequenceStartToken', 'FlowEntryToken', 'DocumentEndToken', 'Mark', 'Dumper', 'scan', 'BlockEntryToken', 'DirectiveToken', 'Node', 'CSafeLoader', 'emit', 'resolver', 'dump_all', 'ScalarNode', 'SequenceStartEvent', 'warnings', 'load_all', 'CollectionEndEvent', 'MarkedYAMLError', 'add_path_resolver', 'CDumper', 'DocumentStartEvent', 'BlockEndToken', '__doc__', 'StreamEndEvent', 'add_multi_constructor', 'compose', 'CFullLoader', 'add_implicit_resolver', 'constructor', 'add_constructor', 'full_load', 'load', 'Event', 'CSafeDumper', '__version__', 'SafeDumper', 'add_multi_representer', 'SafeLoader', 'KeyToken', 'ValueToken', 'safe_dump_all', 'tokens', 'CollectionStartEvent', 'compose_all', 'representer', 'unsafe_load_all', 'FlowSequenceEndToken', 'reader', 'composer', 'Token', '__with_libyaml__', 'BaseLoader', 'AnchorToken', 'parse', 'nodes', 'Loader', 'serialize_all', 'CollectionNode', 'events', 'dump', 'io', 'NodeEvent', 'UnsafeLoader', '__loader__', 'unsafe_load', 'ScalarToken', '__file__', 'safe_dump', '__builtins__', 'CBaseDumper', 'AliasEvent', 'BaseDumper', 'YAMLObjectMetaclass', 'TagToken', 'DocumentStartToken', '__package__', 'emitter', 'error', 'MappingStartEvent', '__cached__', 'StreamStartEvent', 'cyaml', '__spec__', 'loader', 'ScalarEvent', 'full_load_all', 'SequenceEndEvent', 'FullLoader', 'AliasToken', 'safe_load_all', 'YAMLError', 'CLoader', 'safe_load', 'MappingNode', 'YAMLObject', 'SequenceNode', 'parser', 'StreamEndToken', '__path__', 'serializer', 'CBaseLoader', '_yaml', 'scanner', 'BlockMappingStartToken', 'StreamStartToken', 'FlowSequenceStartToken', 'MappingEndEvent', 'dumper', '__name__', 'FlowMappingEndToken', 'serialize'}

and the new version I get:

new = {'__all__', 'TagToken', '__spec__', 'StreamStartToken', 'ScalarNode', 'FlowMappingEndToken', 'StreamEndToken', 'Dumper', 'composer', 'unsafe_load_all', 'error', 'io', '__submodules__', 'full_load', '_yaml', 'add_multi_representer', 'SequenceEndEvent', 'CDumper', 'MappingStartEvent', 'scan', 'safe_dump', 'events', 'SequenceNode', 'emitter', 'CFullLoader', 'SequenceStartEvent', 'warnings', 'Loader', 'DocumentEndToken', '__doc__', 'scanner', 'BaseLoader', 'CBaseLoader', 'serialize', 'compose_all', '__package__', 'SafeLoader', '__path__', 'add_path_resolver', 'MappingNode', '__file__', 'BaseDumper', 'BlockSequenceStartToken', 'Mark', 'parse', 'serialize_all', 'serializer', 'constructor', 'safe_load_all', '__with_libyaml__', 'CLoader', 'FlowMappingStartToken', 'ValueToken', 'emit', 'CSafeDumper', 'DocumentStartToken', 'add_representer', 'Event', 'BlockEndToken', '__builtins__', 'safe_dump_all', 'StreamEndEvent', 'add_implicit_resolver', 'YAMLError', 'MarkedYAMLError', '__name__', 'FlowSequenceEndToken', 'AliasToken', 'AliasEvent', 'dump', 'UnsafeLoader', 'tokens', 'FullLoader', 'full_load_all', 'ScalarToken', 'load_all', 'compose', 'representer', 'CollectionEndEvent', 'reader', 'MappingEndEvent', 'SafeDumper', 'add_constructor', 'cyaml', 'parser', 'YAMLObject', 'loader', 'NodeEvent', 'YAMLObjectMetaclass', 'DirectiveToken', 'BlockMappingStartToken', 'StreamStartEvent', 'safe_load', '__dev__', 'unsafe_load', 'CUnsafeLoader', 'Token', 'resolver', 'AnchorToken', '__loader__', 'FlowEntryToken', '__version__', 'KeyToken', 'CSafeLoader', 'CollectionNode', 'BlockEntryToken', 'nodes', 'load', 'CollectionStartEvent', 'DocumentStartEvent', 'dumper', 'FlowSequenceStartToken', 'ScalarEvent', '__cached__', 'DocumentEndEvent', 'dump_all', 'Node', 'add_multi_constructor', 'CBaseDumper'}

And this verifies that nothing has been removed from the top level API.

In [7]: orig - new
Out[7]: set()

In [8]: new - orig
Out[8]: {'__all__', '__dev__', '__submodules__'}

Copy link
Member

@nitzmahone nitzmahone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of other minor seemingly unrelated/unnecessary changes to zap, then this looks good to go. Thanks for double-checking the equivalence (and "showing your work" 😉). With our current workflow, we don't usually merge things to a release branch until we're actually prepping a release, but with those couple changes, this one looks good to go for the next release- I'll mark it as such and we'll merge it whenever we do one (probably sometime before Python 3.11 final release, but TBD). Thanks!

lib/yaml/__init__.py Outdated Show resolved Hide resolved
lib/yaml/__init__.py Outdated Show resolved Hide resolved
@nitzmahone nitzmahone added this to In Progress in PyYAML 6.1 release planning Jul 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

None yet

2 participants