Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR] Allow plugins to pull in configuration from pyproject.toml #3415

Open
Tracked by #2671
abravalheri opened this issue Jun 26, 2022 · 20 comments
Open
Tracked by #2671

[FR] Allow plugins to pull in configuration from pyproject.toml #3415

abravalheri opened this issue Jun 26, 2022 · 20 comments
Labels
Needs Design Proposal Needs Discussion Issues where the implementation still needs to be discussed.

Comments

@abravalheri
Copy link
Contributor

abravalheri commented Jun 26, 2022

Originally this issue was discussed in #2671.

Plugin writers would like a way of pulling in configuration/metadata from pyproject.toml.
Currently the finalize_distribution_options hook runs before parse_config_files, which makes integration difficult.

@RonnyPfannschmidt, please correct me if I understood something wrong.

It would be nice if we can clarify what would be the requirements of this feature.

@abravalheri abravalheri changed the title hooks to pull in the pep-621 metadata Allow plugins to pull in configuration from pyproject.toml Jun 26, 2022
@abravalheri abravalheri changed the title Allow plugins to pull in configuration from pyproject.toml [FR] Allow plugins to pull in configuration from pyproject.toml Jun 26, 2022
@abravalheri
Copy link
Contributor Author

@RonnyPfannschmidt, I am thinking along the lines of:

class BuildHooks:
   """
   Build hooks can be added via the `setuptools.build_hooks` entry-point, and they should implement
   this interface.
   
   All public methods are reserved for future setuptools use (despite just a few hooks
   being currently defined). All public methods specified in the interface are optional,
   but when defined, they **MUST** comply with the given signature.
   """
   
   def __init__(self, dist: Distribution):
       """Plugins are initialized  inside ``finalize_options``
       (just like the ``finalize_distribution_option`` entry-point runs nowadays).
       
       Plugin writers are encouraged to save the ``dist`` argument in a private property to
       use it in successive hooks.
       """
       
   def before_config_hook(self, project_metadata: dict, plugin_config: Any):
       """
       Setuptools will pass ``plugin_config`` with the value of the first of the following
       configurations defined in ``pyproject.toml``:
       ``tool.setuptools.plugin.{{entry_point_name}}`` or ``tool.{{entry_point_name}}``.
       If the configuration is not present, the value of ``config`` will be ``None``.
       If desired, the hook implementation is responsible for validating the ``plugin_config``
       value.
       
       ``project_metadata`` corresponds to the raw metadata in the ``[project]`` table
       **if** provided in ``pyproject.toml``.
       
       Plugin writers can take advantage of ``dist`` (passed on the initialization method)
       to decide if they want to manipulate ``dist.cmdclass`` and/or add sub commands to 
       ``setuptools.command.build.build.sub_commands`` based on the value of ``plugin_config``.
       It is recommended that plugins refrain from altering the build behaviour if
       ``plugin_config`` is ``None`` (i.e. users did not opt-in).
       
       Plugin writers can also modify ``dist`` properties, but they must be aware that
       setuptools might modify these properties again after processing the ``pyproject.toml``.
       """
       
   def after_config_hook(self):
       """
       Plugin writers can use this method when they need to access attributes
       in ``dist`` after it being fully configured, but they **SHOULD** not
       modify any attribute related to project metadata
       (otherwise it could cause a direct violation of the ``dynamic``
       aspect of PEP 621 and PEP 643).
       """
       
   def before_commands_hook(self):
       """Setuptools will run this hook immediatelly before running the first command."""

@RonnyPfannschmidt
Copy link
Contributor

Standards for Metadata providing hooks would help better

@abravalheri
Copy link
Contributor Author

Do you have a suggestion @RonnyPfannschmidt (e.g. a signature)?

@RonnyPfannschmidt
Copy link
Contributor

Not yet, but if both setuptools and hatch where able to use the same api fit that it would be great

I already have the hacks in place and I'm not keen on supporting many versions of setuptools with different api sets

@abravalheri
Copy link
Contributor Author

I cannot speak for everyone in the setuptools project, but I am more than open to discuss any API/interoperability (as long as it does not involve an overhaul on the setuptools internal design -- ideally we are looking for something that can be integrated easily into the existing flow).

@abravalheri abravalheri added Needs Discussion Issues where the implementation still needs to be discussed. Needs Design Proposal labels Jun 27, 2022
@RonnyPfannschmidt
Copy link
Contributor

Ping @ofek wrt potentially shareable api for dynamic Metadata sources

@moshez
Copy link

moshez commented Jul 10, 2022

@abravalheri -- would part of fixing this ticket involve moving finalize_distribution_options to after parse_config_files()?

Or would plugin authors need to hook into new entry points to run after parse_config_files()?

Would it make sense to split this ticket into two: moving finalize_distribution_options to after parse_config_files and as a separate effort, have an API for plugins to read configuration from the toml?

@abravalheri
Copy link
Contributor Author

would part of fixing this ticket involve moving finalize_distribution_options to after parse_config_files()?

Hi @moshez, there is an "unfinished" discussion about this in the pypa/setuptools discord. I don't know how safe this operation would be because there might be code depending on the fact that finalize_distribution_options runs before parse_config_files (plugins or setuptools itself).
In my mind it would be safer to have a separated hook, but this opinion is very personal.

Would it make sense to split this ticket into two: moving finalize_distribution_options to after parse_config_files and as a separate effort, have an API for plugins to read configuration from the toml?

We can definitely do that. For the time being I am postponing this, because first I would like to listen from the plugin developers what are their list of requirements and what would be their idea of an useful API.

@moshez
Copy link

moshez commented Jul 17, 2022

Thanks @abravalheri .

As someone who has written a plugin, I can tell you that my preference would be for "finalize..." to be called after the parsing of the toml file, because I assume that anything I modify in "finalize..." does not change any more. However, if there is a new hook, I can modify my plugin to use that. Either way, the lack of any hook that is after the parsing of the toml file is limiting the abilities of a plugin. What are the next steps? I can help circulating an informal poll, but we will need to decide how long we wait for responses, and how the decision is made, before starting to write the poll: in my experience, these decisions influence the design of the poll.

Let me know if I can help in any way, or if there is any further feedback you need.

@KOLANICH
Copy link
Contributor

KOLANICH commented Aug 9, 2022

I think that there should be a way to specify mode of invocation of a plugin. There should be at least 2 modes:

a. a plugin is always invoked
b. a plugin is invoked only if pyproject.toml has a section for it. If pyproject.toml doesn't have a section, plugin's package must not be imported at all.

It is proposed to provide the possibility to specify this it via "entry_points metadata" - JSON (or maybe any other serialization format like JSON) appended to the end of entry point name after a delimiter (currently in my packages I use @).

@KOLANICH
Copy link
Contributor

KOLANICH commented Aug 9, 2022

In #2031 it was also proposed to determine which data to provide via the entry point function signature using the annotations and arguments names.

@abravalheri
Copy link
Contributor Author

Hi @KOLANICH thank you very much for the comment and references to the previous PRs/issues. I tried to go over them and get a gist of the proposal, but I confess that I might not have understood them 100%. So, please be patient with me while I try to understand 😅.

Approaches (a) and (b) sound reasonable (although we can also argue that as long as we have a way of providing the configuration from pyproject.toml to the plugin, the plugin can also do an early return if the configuration is empty).

It is proposed to provide the possibility to specify this it via "entry_points metadata" - JSON (or maybe any other serialization format like JSON) appended to the end of entry point name after a delimiter (currently in my packages I use @).

What are the pros/cons of providing metadata via entry-point specification vs embedding it into the design of the Python API? For example, we could require the entry-point to be a reference to a callable object that also provides a set of attributes that contain the metadata:

class FinalizeDistributionHook(Protocol):
     # Metadata as attr
     # We can use `getattr(entry_point, attr, None)` for backward compatibility
     opt_in: Optional[bool] = True
     delayed_execution: Optional[bool] = True

     # Config ingestion
     # We can use `getattr(entry_point, 'configure', lambda _: None)` for backward compatibility
     def configure(self, options: dict | None) -> None: ...

     # Actual implementation
     def __call__(self, dist: Distribution) -> None: ...

I believe that it would be important to discuss changes in the entry-point specification in the PyPA level, since PyPA now holds a spec on entry-point: https://packaging.python.org/en/latest/specifications/entry-points/[^1]

@abravalheri
Copy link
Contributor Author

Hi @moshez, I am terribly sorry that this slipped under my radar.

What are the next steps? I can help circulating an informal poll, but we will need to decide how long we wait for responses, and how the decision is made, before starting to write the poll: in my experience, these decisions influence the design of the poll.

Let me know if I can help in any way, or if there is any further feedback you need.

I think that we need someone to champion a design (or a small set of designs that can be voted/commented by the community in a pool as you suggested). One important part of this work is to explain which problem(s) the plugin writers are trying to solve and get the "buy-in" of the other (more experienced) maintainers of setuptools.

Once we have consensus on the design and the buy-in of the other maintainers, I will be happy to support the development of the feature as much as I can.

@moshez
Copy link

moshez commented Aug 10, 2022

Hi @abravalheri , so re step 1, explaining the problem that plugin writers can face, I tried to do a PoC in https://github.com/moshez/onedotoh/ -- this is a "parody" of a setuptools plugin: it just overrides the version to always be "1.0.0".

I have two kinds of projects in the repo: one a "classic", setuptools_example, and the other, a modern version, pyproject_example.

I checked in a script ineffective-plugin.sh, that builds both projects. The setuptools example correctly gets the version from the plugin, "1.0.0". However, the pyproject example, even though it opts into the onedotoh version, gets "0.1.2". This happens because pyproject.toml settings are read after the plugin has been called.

There are a number of possible designs to fix this, of course -- but I first want to make sure that the problem makes sense, and that you agree that it does need a solution. If there is a better way that I can implement "onedotoh" plugin to override the version of pyproject_example that I am not seeing, I'm happy to be educated -- and I'll be happy to write a patch for the setuptools plugins documentation. However, after reading carefully through the setuptools code, I don't see a better hook.

Edited to add -- if you think this issue is not the right place to discuss this problem, let me know if I should open a new issue or start a discussion on discourse.

Edited again to add: just to clarify, the final output of the shell script is:

pyproject_example/dist/:
a_pyproject-0.1.2-py3-none-any.whl

setuptools_example/dist/:
a_setup-1.0.0-py3-none-any.whl

and the desired output of the shell script would have been to have a_pyproject-1.0.0 in there.

@graingert
Copy link
Contributor

graingert commented Aug 10, 2022

My understanding of PEP 621 was that the metadata in the pyproject.toml always took precedent and that your build wrapper tool should crash if it sees a .whl that doesn't match the metadata

Eg pip or python -m build should reject the whl if it saw your version set to 0.1 in the pyproject.toml and 1 in the .whl

But it looks like that's not happening here because setuptools is silently fixing the .whl metadata after "your plugin broke it"

@abravalheri
Copy link
Contributor Author

abravalheri commented Aug 10, 2022

Thank you very much for working on this @moshez .

As pointed out by @graingert , maybe this is not the best example for the feature. Indeed PEP 621 requires setuptools to ensure the final value of version is the one given in pyproject.toml. If instead you use dynamic = ["version"], as implied in PEP 621, I guess things would work properly?

Right now setuptools can "fix" the compliance with PEP 621 because the hooks run before reading pyproject.toml. Of course, after we allow plugins to run after the configuration takes place, it will be the responsibility of the plugin to follow PEP 621.


By having a quick look at setuptools-scm, I can see that the problem they seem to be interested in solving is more on the how to read metadata side than in the how to write metadata side. setuptools-scm wants to know the distribution name, which indeed requires all the configuration steps to be finished by the time the hook runs.

@graingert
Copy link
Contributor

But it looks like that's not happening here because setuptools is silently fixing the .whl metadata after "your plugin broke it"

I actually think in this scenario it would be better if setuptools crashed rather than silently disobey a plugin or violate PEP 621. Out of interest what does build/pip do if a build backend does violate PEP621 and are there any sdists on pypi that do this?

@abravalheri
Copy link
Contributor Author

abravalheri commented Aug 10, 2022

I actually think in this scenario it would be better if setuptools crashed rather than silently disobey a plugin or violate PEP 621.

Given the existing circumstances (1. plugins might run even if the users don't opt in for using them; 2. the user is, after all, explicitly telling setuptools to use the given version), the behaviour may make sense as it is... I am happy to review a PR if someone is interested in adding this extra check, but one of the things that I would be careful is to avoid growing even more the complexity of the implementation (already too complex).

Out of interest what does build/pip do if a build backend does violate PEP621 and are there any sdists on pypi that do this?

Not entirely sure... Based on the previous discussion in #3195, I think pip does not implement any strong validation of this part of the PEP (at least not for pip wheel/classifiers field).

@KOLANICH
Copy link
Contributor

although we can also argue that as long as we have a way of providing the configuration from pyproject.toml to the plugin, the plugin can also do an early return if the configuration is empty

The pros of entry_points metadata is that they don't involve code being executed. Soketikes it is highly undesireable even to import a package if it is not actually needed:

  • an importee can import large dependencies
  • an importee can do costly initialization
  • an importee can have a viral license, like GPL, so importing it into non-GPL modules can be illegal
  • importees can be alternatives to each other, via metadata we can specify priorities and load only the most prioritized entry point instead of loading each one

For example my setuptools plugin on import discovers own backends via entry point mechanism and one bavkend discovers some CLI tools and another one (GPL-licensed, for most of packages it is not loaded, in its entry point metadata it is written it is GPLed, so for the most of packages it is not even imported) loads whole JVM (using awesome JPype lib) and some jars into it. All of this is slow and gives a delay of a second. And if anything gies wrong during plugin initialization, it breaks the build process completely.

The most of packages don't depend on the setuptools plugin and the lib used by it, so they shouldn't suffer from the slowdown and get the risk of their build process disrupted.

@abravalheri
Copy link
Contributor Author

Thank you very much @KOLANICH .
I agree that having static metadata in entry-point might offer a series of advantages.

However, pursuing this approach would require to build consensus in the community and quite possibly a change in the entry-points spec. That is why the API- oriented approach might be the path of least resistence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Design Proposal Needs Discussion Issues where the implementation still needs to be discussed.
Projects
None yet
Development

No branches or pull requests

5 participants