Add encoded keyword for the HTTPFileSystem #1021

Merged
martindurant merged 6 commits into fsspec:master from feature-encoded-urls on Sep 29, 2022

Conversation

mraspaud
Contributor

This PR adds the encoded keyword argument to HTTPFileSystem so that already-encoded URLs can be passed through unaltered.

The rationale is that aiohttp, through yarl, canonicalizes URLs before using them in .get. This works for most servers, since they comply with RFC 3986, among other standards. However, some servers, in particular those following the OpenAPI specification, do not comply, which causes problems, for example, when sending a URL that contains : in the path.
To prevent canonicalization of the URL, the aiohttp documentation recommends passing encoded=True when creating the yarl.URL object and then passing that object to aiohttp's get. More details and examples can be provided upon request.
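To illustrate the difference, here is a minimal sketch using yarl directly. The URL is a placeholder; `%30` (an encoded `0`) is used because it is the case shown in the aiohttp documentation, and the assumption is that percent-encoded characters such as `:` in the path go through the same requoting step.

```python
from yarl import URL

# Placeholder URL, not a real endpoint. "%30" is a percent-encoded "0".
raw = "http://example.com/%30"

# Default construction: yarl requotes the path, decoding "%30" to "0".
canonical = URL(raw)

# encoded=True: yarl trusts the string as already encoded and leaves it alone.
verbatim = URL(raw, encoded=True)

print(canonical)  # http://example.com/0
print(verbatim)   # http://example.com/%30
```

Passing `verbatim` to an aiohttp session's `get` should then send the path exactly as written.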

This PR therefore makes it possible to configure HTTPFileSystem to bypass URL canonicalization.
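As a rough sketch of the intended usage, assuming fsspec and aiohttp are installed: the URL in the comment is hypothetical and no request is actually made here.

```python
from fsspec.implementations.http import HTTPFileSystem

# encoded=True tells the filesystem that URLs are already percent-encoded
# and should be handed to aiohttp without requoting.
fs = HTTPFileSystem(encoded=True)

# Hypothetical call against a server that needs the path sent verbatim:
# data = fs.cat("https://api.example.com/items/a%3Ab")
```

The default, `encoded=False`, keeps the existing behaviour, so current users should be unaffected.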

@mraspaud
Contributor Author

PS: hints on how to test this would be welcome.

@martindurant
Member

To test, I suppose you would need the server (fsspec.tests.HTTPTestHandler) to simply respond with the full path that was requested, with this mode activated by passing a special header keyword.
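The idea could be sketched with the standard library alone, roughly as below. This is not the actual fsspec test handler: the class name `EchoPathHandler`, the helper `fetch_echoed_path`, and the header name `give_path` are all assumptions made for this illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPathHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that echoes the raw request path on demand."""

    def do_GET(self):
        # "give_path" is an assumed header name used only in this sketch.
        if "give_path" in self.headers:
            body = self.path.encode()
        else:
            body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


def fetch_echoed_path(path):
    """Start a throwaway local server, request `path`, return what it saw."""
    server = HTTPServer(("127.0.0.1", 0), EchoPathHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}{path}"
        req = urllib.request.Request(url, headers={"give_path": "true"})
        # An empty ProxyHandler avoids any proxy configured in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(req) as resp:
            return resp.read().decode()
    finally:
        server.shutdown()
        server.server_close()
```

A test can then compare the echoed path with the one that was sent, for example `fetch_echoed_path("/index/a%3Ab")`, and do the same through HTTPFileSystem with encoded=True.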

@martindurant
Member

Do you expect to be able to make a test like I suggested?

@martindurant
Member

Please check that the test I added makes sense to you.

@mraspaud
Contributor Author

@martindurant thanks a lot for adding the test! I've been travelling these last few weeks, so I didn't have time to look at this before today. It looks good; I just added a ":" character, as that was the one I was struggling with. So all good for me now!

@martindurant martindurant merged commit 3163a23 into fsspec:master Sep 29, 2022
@mraspaud mraspaud deleted the feature-encoded-urls branch September 29, 2022 17:10