Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function amz_cano_querystring() does not work correctly. #45

Open
napat1412 opened this issue Jan 20, 2021 · 2 comments
Open

Function amz_cano_querystring() does not work correctly. #45

napat1412 opened this issue Jan 20, 2021 · 2 comments

Comments

@napat1412
Copy link

I found problem about oder of canonicail querystring.
It is not sorted correctly

>>> from requests_aws4auth import AWS4Auth
>>> qs = 'key=&key-type=s3&format=json'
>>> AWS4Auth.amz_cano_querystring(qs)
'format=json&key-type=s3&key='

Expected behavior

>>> AWS4Auth.amz_cano_querystring(qs)
'format=json&key=&key-type=s3'

Querystring keys consist of format, key-type and key.
After sorted, querystring keys should be format, key, key-type.
(key must sorted before key-type).
I think you need to rewrite the sort method.

@willieconrad
Copy link

If it helps, I resolved this issue by using the following extension. I can probably make a PR with these changes some time soon ... I tested this pretty thoroughly with all kinds of crazy query string characters, but the more eyes on it the better.

class FixedAWS4Auth(AWS4Auth):
    """The current implementation of AWS4Auth is badly broken.

    When constructing the canonical query string, the first thing it
    does is decode the entire URL.  This is wrong right out of the gate.
    Consider, for example, that someone needs to send a query parameter
    that contains a value of '=' or '&'.  The properly encoded query string
    will come in to amz_cano_querystring as

        a=%26%3D

    However, as soon as it is decoded, we'll get a=&=, which will completely
    mess up the parsing and give 'a' a blank value.  The correct solution is to
    use parse_qs, which properly parses and decodes the query string.

    There are also some issues with how '+' is interpreted.  Sigv4 stipulates
    that spaces must always be encoded as %20, and that a '+' must be encoded
    as %2B, so the canonical query string should never end up with '+'
    character.  The current implementation is not guaranteeing this, as
    it is using urllib.parse.quote, which encodes spaces as '+' by default.
    """

    @staticmethod
    def amz_cano_querystring(qs):
        """
        Parse and format querystring as per AWS4 auth requirements.

        Perform percent quoting as needed.

        qs -- querystring

        """
        # If Python 2, switch to working entirely in str
        # as urlencode() has problems with Unicode
        if PY2:
            qs = qs.encode('utf-8')

        # URL is already encoded at this point, parse to get individual params.
        # This will interpret '+' as a space, which is not configurable in Python 2.
        # This is fine, since requests does the same thing.

        # We must supply keep_blank_values as true here, or else they will be
        # omitted entirely. Sigv4 expects blank values to appear in the canonical
        # query string as 'var='.
        params = parse_qs(qs, keep_blank_values=True)
        params = sorted(list(params.items()), key=lambda item: item[0])

        # Multi values are handled by doseq, but they need to be lex sorted within themselves.
        # This is not explicit in the sigv4 doc, but determined via experimentation.
        # doseq will preserve the order we set here.
        params = [(k, list(sorted(v))) for k, v in params]
        qs = urlencode(params, doseq=True)

        # Sigv4 requires we never see '+' in the query string.  Spaces must
        # be %20, and an actual '+' must be encoded as '%2B'.  urlencode can be
        # configured to quote without using '+', but only in Python 3.  This makes
        # it a bit more explicit and loosens the dependency on Python version.
        qs = qs.replace('+', '%20')

        if PY2:
            qs = unicode(qs)
        return qs

@tedder
Copy link
Owner

tedder commented Mar 9, 2022

feel free to PR it 👍 , it's okay to drop py2 support too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants