Implement ETag support #5298

Merged
merged 9 commits into from Feb 14, 2021

Contributor

@greshilov greshilov commented Nov 28, 2020

What do these changes do?

This PR adds an etag property to the response object and if_match and if_none_match properties to the request object. It also implements ETag support in static routes and fixes a few bugs found along the way.

Despite the low priority of static file handling in aiohttp, I believe ETag and related headers have wide application beyond this purpose.

Are there changes in behavior for the user?

Nope, no general changes in server behaviour.

Related issue number

Fixes #4594

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

codecov bot commented Nov 28, 2020

Codecov Report

Merging #5298 (699e959) into master (742a8b6) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5298      +/-   ##
==========================================
+ Coverage   97.17%   97.18%   +0.01%     
==========================================
  Files          41       41              
  Lines        8768     8849      +81     
  Branches     1404     1421      +17     
==========================================
+ Hits         8520     8600      +80     
- Misses        130      131       +1     
  Partials      118      118              
Flag Coverage Δ
unit 97.07% <100.00%> (+0.01%) ⬆️

Impacted Files Coverage Δ
aiohttp/__init__.py 100.00% <100.00%> (ø)
aiohttp/helpers.py 96.77% <100.00%> (+0.09%) ⬆️
aiohttp/web_fileresponse.py 98.62% <100.00%> (-1.38%) ⬇️
aiohttp/web_request.py 95.92% <100.00%> (+0.19%) ⬆️
aiohttp/web_response.py 98.21% <100.00%> (+0.11%) ⬆️
aiohttp/http_parser.py 97.45% <0.00%> (+0.21%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

        value=ETAG_ANY,
    )
else:
    for match in LIST_QUOTED_ETAG_RE.finditer(etag_header):
Contributor

It also accepts garbage like "1",spam"2".

Contributor Author

@greshilov greshilov Dec 5, 2020

Yeah, you're right.

To solve this problem I dropped the LIST_QUOTED_ETAG_RE approach and switched to split + re.fullmatch.
Now only "1" would be matched from the proposed garbage string. I've checked nginx behaviour and it's consistent with the suggested solution: nginx matches etag values from the header until the first invalid one is found. The new approach is also 10x faster than the old one.

The main disadvantage is that memory consumption will probably increase, but I don't think really long If-Match and If-None-Match headers are that widespread.
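
For readers following along, the split + re.fullmatch idea described above can be sketched roughly as follows. This is only an illustration, not the code from the PR: the pattern `_QUOTED_ETAG_RE`, the `ETag` tuple and the function name `etag_values_split` are assumptions made for the example, and the pattern is a simplification of the RFC 7232 entity-tag grammar.

```python
import re
from typing import Iterator, NamedTuple

# Assumed, simplified pattern: an optional weak prefix "W/" followed by a
# double-quoted value. The real grammar is stricter about allowed characters.
_QUOTED_ETAG_RE = re.compile(r'(W/)?"([^"]*)"')


class ETag(NamedTuple):
    """Illustrative stand-in for the ETag helper discussed in this PR."""

    value: str
    is_weak: bool = False


def etag_values_split(header: str) -> Iterator[ETag]:
    """Yield entity tags from a header until the first invalid item."""
    for raw in header.split(","):
        match = _QUOTED_ETAG_RE.fullmatch(raw.strip())
        if match is None:
            # Stop at the first item that is not a well-formed entity tag.
            break
        is_weak, value = match.group(1, 2)
        yield ETag(value=value, is_weak=bool(is_weak))
```

With this sketch, `"1",spam"2"` yields only the tag `1`. Note that splitting on commas also cuts a quoted value such as `"1,2"` in two, so nothing is yielded for it.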

if etag_header == ETAG_ANY:
    yield ETag(
        is_weak=False,
        value=ETAG_ANY,
Contributor

There is some inconsistency here. If ETag is "*", the value attribute will contain quotes. In all other cases they will be stripped.

Contributor Author

Not quite. * is a special value that is used without quotes. So if someone uses "*" instead of *, it will be treated as a regular etag.

Unfortunately, an ETag(value='*') produced from "*" is indistinguishable from one produced from *.
So, for example:

"*", "tag", "other-tag" -> (ETag(value='*'), ETag(value='tag'), ETag(value='other-tag'))
* -> (ETag(value='*'), )

Maybe it's better to create a separate type ETAG_ANY and change Optional[Tuple[ETag, ...]] to Optional[Union[ETAG_ANY, Tuple[ETag, ...]]]

"*", "tag", "other-tag" -> (ETag(value='*'), ETag(value='tag'), ETag(value='other-tag'))
* -> ETAG_ANY

@asvetlov what do you think?

Contributor

You are right.
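
As an illustration of the alternative floated in this thread (a separate type for the unquoted `*`), here is a minimal sketch. It is not what the snippet under review does; the names `AnyETag`, `ANY_ETAG` and `parse_etag_header`, as well as the simplified pattern, are hypothetical and exist only for this example.

```python
import re
from typing import NamedTuple, Tuple, Union

ETAG_ANY = "*"
# Assumed, simplified entity-tag pattern (optional "W/" plus a quoted value).
_QUOTED_ETAG_RE = re.compile(r'(W/)?"([^"]*)"')


class ETag(NamedTuple):
    """Illustrative stand-in for the ETag helper discussed in this PR."""

    value: str
    is_weak: bool = False


class AnyETag:
    """Hypothetical sentinel type meaning 'matches any entity tag'."""

    def __repr__(self) -> str:
        return "ANY_ETAG"


ANY_ETAG = AnyETag()


def parse_etag_header(header: str) -> Union[AnyETag, Tuple[ETag, ...]]:
    """Return the sentinel for a bare '*', otherwise a tuple of ETags."""
    if header.strip() == ETAG_ANY:
        return ANY_ETAG
    tags = []
    for raw in header.split(","):
        match = _QUOTED_ETAG_RE.fullmatch(raw.strip())
        if match is None:
            break
        tags.append(ETag(value=match.group(2), is_weak=bool(match.group(1))))
    return tuple(tags)
```

Under this sketch a bare `*` returns the sentinel, while a quoted `"*"` stays a regular `ETag(value='*')`, so the two cases remain distinguishable.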

Member

asvetlov commented Dec 5, 2020

@serhiy-storchaka thank you very much for the review!

@greshilov please address comments, they are very valuable.

@greshilov greshilov force-pushed the pr-4594 branch 3 times, most recently from 0a01987 to b66aad1 Compare December 5, 2020 17:12
Contributor Author

greshilov commented Dec 5, 2020

@serhiy-storchaka, @asvetlov thanks for the review! I appreciate your help.

I marked conversations as resolved where I used the proposed solution without any changes; sorry if I shouldn't have done that myself.

Comment on lines 41 to 45
from .helpers import (
    BasicAuth as BasicAuth,
    ChainMapProxy as ChainMapProxy,
    ETag as ETag,
)
Contributor

Why not simply from .helpers import BasicAuth, ChainMapProxy, ETag?

Contributor Author

I guess it's a kind of protection for the public interface. Andrew can tell for sure.

Member

No, we have no policy for such cases.
Usually, we use as only for importing under a different name if there is a conflict.

Please feel free to drop as.

        value=ETAG_ANY,
    )
else:
    for raw_etag in etag_header.split(","):
Contributor

What if an ETag contains a comma? "1,2"

Actually, the approach with LIST_QUOTED_ETAG_RE was not bad. You just need to add |(.) at the end of the regexp. That group will match only if there is garbage, so you can check it and raise an exception.

LIST_QUOTED_ETAG_RE = re.compile(fr"{_QUOTED_ETAG}(?:\s*,\s*|$)|(.)")

Contributor Author

@greshilov greshilov Dec 6, 2020

Yeah, I completely forgot about this case while thinking about optimizations. The tests were updated to cover this situation.

Thanks for the smart solution!
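
The regexp suggested in this thread can be tried out with a small sketch like the one below. The thread does not show how `_QUOTED_ETAG` is defined, so the definition here is an assumption (a simplified pattern), as are the `ETag` tuple and the function name `etag_values`. Raising `ValueError` on garbage follows the reviewer's suggestion; stopping silently at that point would be an equally plausible choice.

```python
import re
from typing import Iterator, NamedTuple

# Assumed definition: optional weak prefix "W/" and a double-quoted value.
_QUOTED_ETAG = r'(W/)?"([^"]*)"'
# Pattern from the review comment: a tag followed by a separator or the end
# of the string, or else a single "garbage" character captured by (.).
LIST_QUOTED_ETAG_RE = re.compile(fr"{_QUOTED_ETAG}(?:\s*,\s*|$)|(.)")


class ETag(NamedTuple):
    """Illustrative stand-in for the ETag helper discussed in this PR."""

    value: str
    is_weak: bool = False


def etag_values(header: str) -> Iterator[ETag]:
    """Yield entity tags; raise ValueError when garbage is encountered."""
    for match in LIST_QUOTED_ETAG_RE.finditer(header):
        is_weak, value, garbage = match.group(1, 2, 3)
        if garbage:
            # The last group matches only when no well-formed tag starts here.
            raise ValueError(f"invalid entity tag list: {header!r}")
        yield ETag(value=value, is_weak=bool(is_weak))
```

In this sketch `"1,2"` is parsed as a single tag with the value `1,2`, because the comma sits inside the quotes, while `"1",spam"2"` yields `1` and then raises on the `s`.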

Comment on lines 41 to 45
from .helpers import (
    BasicAuth,
    ChainMapProxy,
    ETag,
)
Contributor

Suggested change
-from .helpers import (
-    BasicAuth,
-    ChainMapProxy,
-    ETag,
-)
+from .helpers import BasicAuth, ChainMapProxy, ETag

Contributor Author

Fixed.

@greshilov greshilov force-pushed the pr-4594 branch 2 times, most recently from c4b70ef to 5f0ca5e Compare December 9, 2020 23:21
Contributor Author

greshilov commented Dec 9, 2020

I should also mention that, according to RFC 7232, the If-Range header may contain an ETag value. The current PR ignores that for now, because implementing ETag support for the If-Range header is not backward compatible and should be split into a separate PR.

@asvetlov
Member

I should also mention that, according to RFC 7232, the If-Range header may contain an ETag value. The current PR ignores that for now, because implementing ETag support for the If-Range header is not backward compatible and should be split into a separate PR.

Agree

Member

@asvetlov asvetlov left a comment

LGTM.
Sorry for the delay in the review.

@asvetlov
Member

@greshilov please merge.

Please edit the commit message first: press "Squash and merge" -> edit the text to clean up meaningless records about intermediate commits -> press "Confirm squash and merge".

Contributor Author

greshilov commented Dec 17, 2020

@greshilov please merge.

Please edit the commit message first: press "Squash and merge" -> edit the text to clean up meaningless records about intermediate commits -> press "Confirm squash and merge".

Unfortunately, GitHub doesn't show me this button. It says only users with write access can do this.

But I can squash commits manually before merge, if needed!

asvetlov and others added 5 commits January 31, 2021 21:34
Contributor Author

greshilov commented Jan 31, 2021

@webknjaz, @Dreamsorcerer can you help me with squash/merging this? Andrew approved this PR but he has no spare time now.

I'm also planning to do a backport to 3.8 this week.

@@ -38,7 +38,7 @@
 )
 from .cookiejar import CookieJar as CookieJar, DummyCookieJar as DummyCookieJar
 from .formdata import FormData as FormData
-from .helpers import BasicAuth as BasicAuth, ChainMapProxy as ChainMapProxy
+from .helpers import BasicAuth, ChainMapProxy, ETag
Member

Why are the imports changed here?

I'm assuming the style in this file follows the no-implicit-reexport approach from Mypy:
https://mypy.readthedocs.io/en/stable/command_line.html#cmdoption-mypy-no-implicit-reexport

Although it's a little odd that it's not enforced in the Mypy tests if that is the case.

Contributor Author

Those classes are already present in the __all__ variable.
I agree about the inconsistency with the mypy style flags.

This change was previously discussed, I'll tag you there.

Member

This is a great example of why style/formatting changes should be separated from the functional ones: a tiny disagreement can block a whole lot of legitimate changes, plus this makes it harder to notice important behavior changes.

I'll just leave this here: https://mtlynch.io/code-review-love/.

Comment on lines +112 to +113
else:
    return any(etag.value == etag_value for etag in etags if not etag.is_weak)
Member

Nitpick, this could be less nested:

Suggested change
-else:
-    return any(etag.value == etag_value for etag in etags if not etag.is_weak)
+return any(etag.value == etag_value for etag in etags if not etag.is_weak)
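
To make the flattened version concrete, here is a rough, self-contained sketch of a strong comparison helper built around the line under review. The function name `strong_etag_match`, the `ETag` tuple and the early return for a lone `*` are assumptions for illustration; the thread shows only the final `return any(...)` expression.

```python
from typing import NamedTuple, Tuple

ETAG_ANY = "*"


class ETag(NamedTuple):
    """Illustrative stand-in for the ETag helper discussed in this PR."""

    value: str
    is_weak: bool = False


def strong_etag_match(etag_value: str, etags: Tuple[ETag, ...]) -> bool:
    """Strong comparison: weak tags never match, values must be equal."""
    # Assumed guard (not shown in the thread): a lone "*" matches anything.
    if len(etags) == 1 and etags[0].value == ETAG_ANY:
        return True
    # With the early return above, no else branch is needed.
    return any(etag.value == etag_value for etag in etags if not etag.is_weak)
```

Because the first branch returns, dropping the `else` changes nothing in behaviour; it only removes one level of nesting.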

Comment on lines +120 to +121
self.etag = etag_value # type: ignore
self.last_modified = last_modified # type: ignore
Member

It was agreed in other PRs to add specific error codes in brackets.

@@ -29,6 +29,7 @@ Dict
 Discord
 Django
 Dup
+ETag
Member

I'm almost sure that the spellchecker is case-insensitive.



@pytest.mark.parametrize(
    "header,header_attr",
Member

@webknjaz webknjaz Feb 14, 2021

FTR it's usually more readable (though lesser known) to use iterables in parametrize:

Suggested change
-"header,header_attr",
+("header", "header_attr"),

)
def test_etag_invalid_value_set(invalid_value) -> None:
    resp = StreamResponse()
    with pytest.raises(ValueError):
Member

The best practice is to always include a match=r'...' argument, demonstrating the intended exception reason more precisely.

@webknjaz
Member

@greshilov I didn't do a thorough review but commented with a few generic pieces of advice in several places. None of them are critical and since Andrew has already approved the patch, I'm going to merge it now. Feel free to send improvements in follow-up PRs.

@webknjaz webknjaz merged commit ced553f into aio-libs:master Feb 14, 2021
@webknjaz
Member

@greshilov waiting for the backport to 3.8 now.

@greshilov
Contributor Author

@webknjaz sure, thank you!

greshilov added a commit to greshilov/aiohttp that referenced this pull request Mar 23, 2021
This change adds an `etag` property to the response object and
`if_match`, `if_none_match` properties to the request object.
Also, it implements ETag support in static routes and fixes a
few bugs found along the way.

Refs:
* https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26
* https://tools.ietf.org/html/rfc7232#section-2.3
* https://tools.ietf.org/html/rfc7232#section-6

PR aio-libs#5298 by @greshilov
Resolves aio-libs#4594

Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
Co-Authored-By: Andrew Svetlov <andrew.svetlov@gmail.com>
greshilov added a commit to greshilov/aiohttp that referenced this pull request Mar 24, 2021
webknjaz pushed a commit that referenced this pull request Mar 24, 2021
commonism pushed a commit to commonism/aiohttp that referenced this pull request Apr 27, 2021