Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: --include-private issue (#16808) #16822

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

nautics889
Copy link
Contributor

@nautics889 nautics889 commented Jan 26, 2024

  • (fix): when --include-private is not passed stubgen omits packages and modules which names start with "_" and do not end with "__"

This commit aims to close #16808.


Let me introduce you to my struggling.
At first I decided that this happens because of is_private_name(). Indeed, when stubgen iterates over nodes via visitor and calls is_private_name() it checks only node's (function's in our case) name:

stubutil.py

    def is_private_name(self, name: str, fullname: str | None = None) -> bool:

        ...

        if name == "_":
            return False
        if not name.startswith("_"):
            return False

        ...

        return True

As you can see from the above, is_private_name() checks only name without considering the fullname.

However, if we modify it and make it check fqn, the problem would still persist — stubfiles would be created and there would be nothing (empty files).

Then, to make stubgen omit those packages/modules entirely, I decided to incorporate this logic into generate_stub_for_py_module().

And it worked, checked manually on structure like the below.

(mypy) PS .../foobar$ tree /f /a
.
|   __init__.py
|
+---subpackage
|       open_module.py
|       _internal_use_module.py
|       __init__.py
|
\---_internal_use_subpackage
        whatever_mod.py
        __init__.py

The output:

(mypy) PS ...\out> tree /f /a
C:.
\---foobar
    |   __init__.pyi
    |
    \---subpackage
            open_module.pyi
            __init__.pyi

Seems to be OK.

* (fix): when --include-private is not passed stubgen omits packages and
  modules which names start with "_" and do not end with "__"

This comment has been minimized.

nautics889 and others added 2 commits January 26, 2024 04:30
* (chore): add missing type hints
* (docs): add docsting

This comment has been minimized.

@nautics889 nautics889 marked this pull request as ready for review January 26, 2024 10:30

This comment has been minimized.

This comment has been minimized.

@hamdanal
Copy link
Contributor

hamdanal commented Feb 4, 2024

Thank you for working on this.
Wouldn't this change make stubgen always skip private modules? Sometimes functions and classes defined in private modules are either re-exported or used in public modules of the code base. In these cases they should not be excluded from the stubs. I don't think stubgen is currently powerful enough to do this cross-file analysis of where symbols are used throughout the code base to determine which of them are really private. I think unfortunately the best we can do now is include everything and manually delete unused files later.
(disclaimer: I am not a maintainer of mypy, just a regular contributor to stubgen)

* (fix): add checking if --include-private passed
@nautics889
Copy link
Contributor Author

@hamdanal

Thank you for the feedback.
True, private module should be included in some cases. This is precisely what the --include-private flag is for. There definitely must be if not include_private condition check before the early return.
Working on it...

Copy link
Contributor

github-actions bot commented Feb 5, 2024

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@hamdanal
Copy link
Contributor

hamdanal commented Feb 6, 2024

I think --include-private is meant for private symbols like classes and functions, not modules (at least this is how it works currently).
Suppose you have a public module in a package, let’s call it public.py and a sibling private module _private.py. It is very common to have something in public.py that does something like:

from ._private import Base

class Derived(Base): …

If you don’t generate _private.pyi, the stubs will be invalid. This is a very common pattern in my experience that I think should not be ignored by stubgen, especially not by default. One solution here is to have a separate command line flag for private modules instead (I don’t personally think it will be useful but if you have a convincing argument for it maybe a maintainer will accept it).

@nautics889
Copy link
Contributor Author

@hamdanal, thank you for the detailed explanation. Indeed, avoiding modules
with leading underscores can result in the incompleteness of a stub file.
Moreover, there are many such modules in the typeshed repository, such as _policybase.pyi in the email package from the standard library, _base.pyi in concurrent (btw the exact to example you've mentioned), and so on.

Tbh, so far, I don't have strong arguments for implementing a flag, like you said, --ignore-private-modules or something that would make stubgen omit those modules. Perhaps only if a user has numerous such _module_name.py modules, which are not preferred to be present in stubfiles yet, but it doesn't sound like a real use-case for me.


Since issue #16808 has been opened, shouldn't we mention this in the documentation somewhere?
I mean, that stubgen generates .pyi for all modules including those with leading underscore.

@nschloe
Copy link

nschloe commented Feb 14, 2024

OP of #16808 here.

@hamdanal points out correctly that any object in a Python module may be available via multiple import paths, some of which may contain leading underscored and some of which may not. Some may consider it bad style to expose objects in two such paths, but it does indeed happen.

I think we can agree that those methods are to be considered public, just like objects which are solely available via leading-underscored paths are to be considered private.

I don't think we should differentiate between a private module and a private function. Two flags --include-private and --ignore-private-modules would probably be confusing.

Instead, a proper solution is to check in is_private_* via which paths it's available.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stubgen builds stubs for private modules
3 participants