Allow non lowercase extras #4901

Closed

Conversation

GabrielC101
Contributor

Fixes #4617

@GabrielC101 GabrielC101 changed the title Allow non lowercase extras WIP: Allow non lowercase extras Dec 2, 2017
@GabrielC101
Contributor Author

GabrielC101 commented Dec 2, 2017

This is a bug, but the documentation is very unclear about whether names are case sensitive.

Behavior is properly documented in setuptools: https://github.com/pypa/setuptools/blob/master/docs/pkg_resources.txt#L663

Except when it's not:
https://github.com/pypa/setuptools/blob/master/docs/setuptools.txt#L689

High priority solution: update documentation for setuptools and pip to make it clear only lowercase names should be used.

@GabrielC101
Contributor Author

GabrielC101 commented Dec 2, 2017

Additional problem: should all names simply be converted to lowercase in pip?

Are there security issues here?

If someone created a myPackage module, and a bad actor created mypackage, the person installing myPackage would be extremely vulnerable.

Currently it would just fail. I'm not sure why there's a simple "does not exist" failure, rather than installing the lowercase version. But that seems to be a good thing, in regards to security.

Edit: in some cases an extra name will be in its initial form, but in most it will be in its lowercase, normalized form. Find out when pip uses the initial form. That's the functional (non-documentation) bug.

@GabrielC101
Contributor Author

Research note: When setuptools/pip generates a wheel, requires.txt lists the extras using the initial, unsanitized, name.

Continuing research.

@GabrielC101
Contributor Author

GabrielC101 commented Dec 3, 2017

Trying to install a package with setup arg of:
name='mytestpackage'
extras_require={'EXTRAS': [extra_dep_one, extra_dep_two]},

Able to replicate error only when installing via local devpi. The following work fine:
pip install .[EXTRAS]
pip install -e .[EXTRAS]

However,
pip install -i https://my-devpi-server.com:4040/root/dev/+simple/ mytestpackage[EXTRAS]
gives me
Ignoring <dependency_package>: markers 'extra == "EXTRAS"' don't match your environment

@pradyunsg
Member

Hi @GabrielC101!

Just wanted to say that this is extremely useful research you're doing here. I've known for a while that extras are not consistently handled throughout pip/setuptools.

IMO, extras should be normalised at the entry point where they come in from. That should then be passed internally only as normalised extras. That means, Extra-1 and extra_1 are the same.

I don't think there's a security issue here since package names follow the same normalisation rule.
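To make "normalised at the entry point" concrete, here is a minimal sketch. It assumes extras would be normalised with the same PEP 503 rule used for package names; `normalize_extra` is a hypothetical helper, not a function from pip or setuptools.

```python
import re


def normalize_extra(name):
    """Hypothetical helper: PEP 503-style normalisation applied to an extra name."""
    # Collapse runs of '-', '_' and '.' into a single '-', then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


# Two spellings of the same extra, as a user and a package author might write them.
requested = normalize_extra("Extra-1")
declared = normalize_extra("extra_1")
same = requested == declared
print(requested, declared, same)
```

Under this assumption both spellings collapse to one key, so a lookup done after normalisation can use plain equality.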

@GabrielC101
Contributor Author

Thank you @pradyunsg . Hopefully I'll figure this thing out.

@benoit-pierre
Member

benoit-pierre commented Dec 4, 2017

Is the goal to do case sensitive lookups for extras (potentially allowing both rest and reST as different extras for the same package), or to make sure the lookup is always case insensitive, so that using both rest and REST would work if the package has the reST extra?

@pradyunsg
Member

pradyunsg commented Dec 4, 2017 via email

@benoit-pierre
Member

I hope too.

@GabrielC101
Contributor Author

Case insensitive would be easiest to implement. A normalized extra is lower case. Changing that would be hard/dangerous.

@GabrielC101
Contributor Author

GabrielC101 commented Dec 5, 2017

Note: this problem is already mentioned in a TODO in a vendored file.

# TODO: Can we normalize the name and extra name?

@GabrielC101
Contributor Author

It seems this is an issue with a vendored file. I've introduced a pull request in packaging to address it.

I'm not sure how you update vendored files in this repo . . . should I submit a similar fix?

@pradyunsg
Member

pradyunsg commented Dec 5, 2017 via email

@GabrielC101
Contributor Author

GabrielC101 commented Dec 5, 2017

Research Note: The original error, Ignoring <dependency_package>: markers 'extra == "EXTRAS"' don't match your environment does not appear when all of the dependencies associated with the extra are already installed.

If the setup.py file contains extras_require={'EXTRA': ['requirementname']} and pip freeze prints out requirementname, there is no error when I run pip install packagename[EXTRA].

However, if pip freeze does not print out requirementname, the error, Ignoring <dependency_package>: markers 'extra == "EXTRAS"' don't match your environment, reappears when I attempt pip install packagename[EXTRA].

@GabrielC101
Contributor Author

GabrielC101 commented Dec 6, 2017

Research Note: The original error, Ignoring <dependency_package>: markers 'extra == "EXTRAS"' don't match your environment does not appear when installing with the flag --no-cache-dir.

Nor does it appear when installing (without --no-cache-dir) the first time after deleting the contents of my cache. It does appear every subsequent time.

Seems safe to say this error only appears when installing from cache.

@GabrielC101
Contributor Author

GabrielC101 commented Dec 6, 2017

Research note:

I usually build and deploy packages to devpi using the following commands:
python setup.py sdist
devpi upload

When I switch to:
python setup.py bdist_wheel --universal
devpi upload --formats bdist_wheel

I get the error regardless of whether I'm installing from cache or devpi. Previously, I only got it when installing from cache.

Edit: I'm shocked by this. I thought it would be the opposite, with non-wheels generating the error. I'm 90% sure this is caused during wheel building, because packaging.requirements.Requirement doesn't force extra names into lower case.

@GabrielC101
Contributor Author

The issue seems to be that packaging/pyparser parse the value using the Marker object in packaging.markers.

@GabrielC101
Contributor Author

Research note: original error only shows up for extras that pip should be attempting to install.
For example, if I enter pip install package_name[EXTRA_ONE], yet setup.py contains both EXTRA_ONE and EXTRA_TWO as keys in the dict given to extras_require, the warning will only show up for EXTRA_ONE.

Presumably pip checks for the presence of both extras. Yet only the requested one triggers the warning.

@GabrielC101
Contributor Author

Research note:
Tried reproducing error with different versions.
pip==8.1.1 - no error
pip==8.1.2 - no error
pip==9.0.0 - error occurs, but output different color

@GabrielC101
Contributor Author

Seems the issue was introduced by pull request #4051 in commit 8f171cd.

@GabrielC101
Contributor Author

GabrielC101 commented Dec 10, 2017

The problem arises when an extra is evaluated as a marker. Extras can be markers. They can also be extras. Conceptually, it's the difference between Requirement.extras and Requirement.markers['extra'].

The Marker object has an evaluate method that evaluates whether the marker is true. The only way an extra marker will evaluate as True is if {"extra": "extraname"} is passed to the evaluate function.

When the marker is evaluated, it is still in whatever case it was written in. There seem to be no special parsing rules for markers with the extra value, so extra == Name stays extra == Name. However, pip assumes the marker name has been normalized, and supplies the environment in lower case when evaluate is called.

In order to fix this, packaging will have to be altered to parse markers with the "extra" value in lower case form.
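The mismatch described above can be shown with a toy model. This is not packaging's implementation: `evaluate_extra_marker` is a hypothetical stand-in that only models the plain string comparison between the marker value and the environment pip supplies.

```python
def evaluate_extra_marker(marker_value, environment):
    """Toy stand-in for evaluating the marker `extra == "<marker_value>"`.

    Not packaging's code; it only models a plain string comparison.
    """
    return environment.get("extra") == marker_value


# Marker value kept exactly as written in the package metadata.
marker_value = "reST"
# The environment carries the extra in lower case, as the thread reports pip does.
environment = {"extra": "rest"}

as_is = evaluate_extra_marker(marker_value, environment)
lowered = evaluate_extra_marker(marker_value.lower(), environment)
print(as_is, lowered)
```

Lowercasing the marker value before comparison is the change proposed above; where that lowercasing should live is the open question in the rest of the thread.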

@GabrielC101
Contributor Author

GabrielC101 commented Dec 12, 2017

Note to self: When a Requirement has an extra (Requirement.extra) the extra is optional. When a Requirement has a marker, a true evaluation is mandatory.

This problem isn't directly arising from the installation of the primary package, but from the dependency that's being installed because the primary package has an extra.

So if a package is named package-name, and setup.py has extras_require={extra-name: [extra-package-name]}, then package-name.extra == extra-name, but extra-package-name.markers['extra'] == extra-name.

@GabrielC101
Contributor Author

GabrielC101 commented Dec 12, 2017

Problem: When resolver attempts to install a subrequirement, the parent Requirement.extras (+ "") become the Requirement.marker['extras'] for the subreq.

Pip parses extras correctly, but markers not at all.

This problem actually begins when a wheel is parsed by the DistInfoDistribution class. It parses Provides-Extra values as strings, but Requires-Dist entries are parsed as requirements, with the extra name as a marker, into the Requirement subclass in pkg_resources.

Provides-Extra values are immediately normalized; however, the old unnormalized version is retained and used to construct the dependency mapping, including evaluating markers. As soon as the dependencies are calculated, the extra name is normalized and stored as the key by which to access the only partially normalized pkg_resources.Requirement.

The dependency mapping at this point is "wrong". The extras are normalized, but the corresponding requirements (and their markers) are not.

When a pip user specifies an extra via command line, the extra name is properly parsed, and it is properly matched to the correct dependencies (whose names are also correctly parsed).

However, when the dependency is being installed, it checks to make sure all the markers are evaluated to True. Because the expected dependency is derived from the parent's extra, the expected extra is properly normalized. Because the markers are the fruit of pkg_resources.Requirement subclass, they are not parsed. Therefore, unless the markers.extras are already in parsed format, they don't match and fail to install.

When the initial error says markers 'extra == "reST"' don't match your environment, "extra" is not the extra from pip install package[extra]. It's actually EXTRA from the METADATA file's Provides-Extra entry - after being normalized via safe_extra.

Provides-Extra is properly parsed so reST becomes rest. However Requires-Dist is not properly parsed, so reST stays as reST. The extra from the initial error is the extra from Requires-Dist.

Ok . . . so how to fix this?

  • Option 1: Better parsing of Requires-Dist entries. These are parsed as Requirements - not packaging requirements, but a subclass in pkg_resources. Apparently pkg_resources doesn't like that packaging's Requirement doesn't normalize names or extras, so it has its own subclass that does the parsing. Unfortunately, it parses names and extras, but not markers. This could be fixed in pkg_resources.

  • Option 2: Fix marker parsing in packaging. Same as option 1, but somewhere else. Seeing as the name-parsing and extra-parsing functionality was added in pkg_resources, it makes sense to parse the markers there too. But if there's interest in updating packaging, marker parsing could be a part of that.

  • Option 3: Stop normalizing extra names in pip. Remove safe_extra. Cases are now sensitive. Bad.

  • Option 4: Stop evaluating markers. Bad.

Edit: What if we disable markers, but only when the markers are evaluating extras? Is there really a need for a requirement to evaluate the extra marker? The requirement is being installed because it was found in the dependency mapper via the parsed extra name key. We know that the extra should be true.
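The half-normalised dependency mapping, and what Option 1 or 2 would change, can be sketched as follows. This is a simplified model under stated assumptions: `build_dep_map` and `installable` are hypothetical helpers, the requirement names are made up, and lowercasing stands in for the full normalisation.

```python
import re

# Pulls the extra name out of a marker such as: extra == 'reST'
EXTRA_MARKER = re.compile(r"""extra\s*==\s*["']([^"']+)["']""")


def build_dep_map(requires_dist, normalize_markers):
    """Map lowercased extra name -> list of (requirement, marker extra value)."""
    dep_map = {}
    for line in requires_dist:
        requirement, _, marker = line.partition(";")
        match = EXTRA_MARKER.search(marker)
        if match is None:
            continue  # unconditional dependency, not relevant here
        raw_extra = match.group(1)
        marker_extra = raw_extra.lower() if normalize_markers else raw_extra
        # The key is always lowercased, mirroring what the thread reports.
        dep_map.setdefault(raw_extra.lower(), []).append(
            (requirement.strip(), marker_extra)
        )
    return dep_map


def installable(dep_map, requested_extra):
    """Return requirements whose extra marker equals the lowercased request."""
    wanted = requested_extra.lower()
    return [req for req, marker in dep_map.get(wanted, []) if marker == wanted]


# Made-up Requires-Dist entries for illustration.
requires_dist = ["extra-dep-one; extra == 'reST'", "always-needed"]

buggy = installable(build_dep_map(requires_dist, normalize_markers=False), "reST")
fixed = installable(build_dep_map(requires_dist, normalize_markers=True), "reST")
print(buggy, fixed)
```

With raw markers the lookup by key succeeds but the marker check rejects the dependency; normalising the marker value at parse time lets both steps agree.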

@GabrielC101
Contributor Author

GabrielC101 commented Dec 13, 2017

I've created a version that prevents subrequirements from checking their parent's extras via markers. Markers are still checked, but no extra is provided to the evaluation, and the resulting error is caught. All other markers still function.

On the one hand, an extra isn't an environment variable. One requirement might have an extra, while another doesn't. The same isn't true for OS or Python version.

On the other hand, the way markers and 'Requires-Dist' are parsed really needs to improve.

It works.

Let me get some tests in place before it's reviewed.
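The approach described in this comment can be sketched with a toy evaluator. Everything here is an assumption for illustration: `UndefinedExtra` stands in for an "undefined environment name" error, and `evaluate` is not packaging's marker evaluation.

```python
class UndefinedExtra(Exception):
    """Hypothetical stand-in for an 'undefined environment name' error."""


def evaluate(marker, environment):
    """Toy evaluator: every (name, expected) pair in the marker must match."""
    for name, expected in marker:
        if name not in environment:
            raise UndefinedExtra(name)
        if environment[name] != expected:
            return False
    return True


def match_markers(marker, environment):
    """Evaluate without supplying an extra; treat a missing name as a match."""
    try:
        return evaluate(marker, environment)
    except UndefinedExtra:
        return True


env = {"os_name": "posix"}  # note: no "extra" key is provided
extra_only = match_markers([("extra", "reST")], env)
other_marker = match_markers([("os_name", "nt")], env)
print(extra_only, other_marker)
```

One limitation of this toy version: a marker that combines an extra with another condition returns True as soon as the missing extra is hit, without checking the rest.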

@GabrielC101
Contributor Author

GabrielC101 commented Dec 14, 2017

Is there any use case where an InstallRequirement.markers[0]["extra"] == extra_name should evaluate as False? Where pip has decided to install a package, but the install should be aborted because the correct extra hasn't been provided?

@pradyunsg pradyunsg added type: enhancement Improvements to functionality C: extras Handling optional dependencies labels Dec 26, 2017
news/4617.bugfix Outdated
@@ -0,0 +1 @@
Stop unnecessary extra evaluations causing from causing errors.
Member

This needs to be reworded.

try:
    return self.markers.evaluate()
except UndefinedEnvironmentName:
    return True
Member

Could you add a comment here about why the flow is the way it is?

@GabrielC101 GabrielC101 force-pushed the allow-non-lowercase-extras branch 6 times, most recently from 89358f6 to 5a40f76 Compare December 27, 2017 17:48
Member

@pradyunsg pradyunsg left a comment

This doesn't seem right but I'm not comfortable putting weight behind that assertion without spending more time to identify what feels wrong; which I don't have right now.

Sorry. :\

@@ -126,3 +127,22 @@ def test_install_special_extra(script):
assert (
"Could not find a version that satisfies the requirement missing_pkg"
) in result.stderr, str(result)


@pytest.mark.skipif(sys.version_info < (3, 0),
Member

Why is this test dependent on Py3?

Contributor Author

django-rest-swagger[reST]==0.3.10 requires python3. There's probably a better way to test this problem, but this works.

Member

I don't really like having a test that:

  1. Depends on an external package that "happens" to have this issue (I describe it like that because we don't appear to have established precisely what causes the issue).
  2. Can only be tested on Python 3, even though the problem can happen on Python 2.
  3. May stop working if the external package changes.

I'd rather we tested this using an artificially-constructed example that we ship with the test suite, so we can control it better.

@pfmoore
Member

pfmoore commented Jan 19, 2018

There's a lot of analysis here that I haven't really read through, but my instinct is that as pip (and PyPA projects in general) are trying to move away from implementation-defined behaviour and towards standards-based, the correct way forward for this change is to propose a change to the relevant standards to formally state whether extras are case-sensitive or not. Once there's a standard, agreeing any necessary changes to the tools to conform to that standard is easy. As things stand, however, it's hard to decide whether the change is "right".

@GabrielC101
Contributor Author

GabrielC101 commented Jan 19, 2018

@pfmoore

The "extras are case insensitive" rule seems established. pip strips extras of their case all the time. Any ambiguity is in the wheel METADATA v 1.3 specification for Provides-Extra.

The reason this error is manifesting is that 99% of extra names are stripped of case, but every now and then the 1% happens.

In fact, that's why it's impossible to install an extra with mixed case names (from a wheel). Before attempting to install the extra, it will be stripped of all case. Due to a weird error the dependency mapping dictionary keeps the value - but not the key - in the original case.

The purpose of this PR isn't to add a feature, but to fix a bug (#4617). Resolving this bug will require choosing either case-insensitive or case-sensitive extra names. The current behavior of "extra refuses to install no matter what name is typed in" should be changed.

try:
    return self.markers.evaluate()
except UndefinedEnvironmentName:
    return True
Member

This doesn't really make sense to me. It seems to just be hiding an error rather than fixing it. The comment seems to be trying to justify doing so, but "in certain situations" doesn't exactly sound like we know what's going on here :-(

@pfmoore
Member

pfmoore commented Jan 20, 2018

OK, thanks for the clarification, although I couldn't find anything in the PEPs that said extras were case insensitive (you originally mentioned PEP 508, but I see you edited your comment).

My point here is that if extra names are case insensitive, they should be normalised (just like package names) so that they can be compared by simple equality. My preference would be for this to be done in the metadata itself, but I'd also be OK with pip normalising all extra names when it encounters them.

We've apparently already introduced bugs by incompletely fixing this issue (at least that's what I understand from the fact that #4617 was introduced by the fix for #3810) so I'd rather that we properly fix it once and for all this time.

@pradyunsg
Member

I'd also be OK with pip normalising all extra names when it encounters them.

+1

I'd go so far as to say that the right thing to do here is to always normalise the extras at the edge, when pip loads them from the metadata. That way when later potential future build backends spew out non-normalised data, pip still does the right thing.

The current logic to handle extras during resolution (in the master) ain't perfect but I'd rather have this PR do better normalisation instead of touching that logic (since the latter seems unnecessary).

@BrownTruck
Contributor

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

@BrownTruck BrownTruck added the needs rebase or merge PR has conflicts with current master label Mar 2, 2018
@benoit-pierre
Member

I don't think this can/should be fixed in pip, see #4617 (comment).

@pradyunsg
Member

Closing this since it's bitrotten. Please file a new PR if this work gets updated.

@pradyunsg pradyunsg closed this May 23, 2019
@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 22, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jun 22, 2019
Labels
auto-locked Outdated issues that have been locked by automation C: extras Handling optional dependencies needs rebase or merge PR has conflicts with current master type: enhancement Improvements to functionality
Successfully merging this pull request may close these issues.

extras_require doesn't work unless extras are all-lowercase
5 participants