Encode strings before using them as URLs for Add Column by Fetching URLs - fixes #6140 #6142

Open
wants to merge 2 commits into
base: master


@tfmorris tfmorris commented Nov 6, 2023

Fixes #6140

Changes proposed in this pull request:

  • Automatically encode the URL before fetching it, using the appropriate encoding for each of the three parts of the URL that need encoding.
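
To make the idea of per-component encoding concrete, here is a minimal sketch using only the JDK. It is not the code in this PR (the test quoted in the review uses Guava's `UrlEscapers`), and it assumes the three parts are the path, query, and fragment. The class and method names are hypothetical. The URL is split with the generic regular expression from RFC 3986, Appendix B, and the multi-argument `java.net.URI` constructor then quotes the characters that are illegal in each component.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: UrlComponentEscaper and escapeByComponent are hypothetical names,
// not part of OpenRefine.
final class UrlComponentEscaper {
    // Generic URI splitter from RFC 3986, Appendix B.
    // Groups: 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
    private static final Pattern PARTS = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    static String escapeByComponent(String raw) {
        Matcher m = PARTS.matcher(raw);
        if (!m.matches()) {
            throw new IllegalArgumentException("Cannot split URL: " + raw);
        }
        try {
            // The multi-argument constructor quotes characters that are illegal
            // in each individual component.
            URI uri = new URI(m.group(2), m.group(4), m.group(5), m.group(7), m.group(9));
            // toASCIIString() additionally percent-encodes non-ASCII characters as UTF-8.
            return uri.toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(escapeByComponent("https://example.com/a b?q=x y#frag ment"));
    }
}
```

Splitting first matters because the same character can be legal in one component and illegal in another, so a single escaper applied to the whole string would either over-escape or under-escape.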

One question I have is whether we should percent-decode the string before feeding it to the URL encoder. This could make it more compatible with existing recipes that use .encode("url"), and would allow other sources of pre-encoded URLs to be used, but it opens up a bit of a can of worms.
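
A small JDK-only sketch of the trade-off, for discussion rather than as a proposed implementation. The class and method names are hypothetical and `example.com` is a placeholder host. `encodePath` escapes unconditionally, so an input that already contains `%20` is escaped a second time. `decodeThenEncodePath` decodes first, which avoids that but changes the meaning of inputs such as `%2F`.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Illustrative only: names are hypothetical and example.com is a placeholder host.
final class DecodeFirstSketch {
    // Escapes the path unconditionally.
    static String encodePath(String path) {
        try {
            // The multi-argument URI constructors always quote '%'.
            return new URI("https", "example.com", path, null, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Percent-decodes first, then escapes.
    static String decodeThenEncodePath(String path) {
        // URLDecoder performs form decoding and would turn '+' into a space,
        // so a literal '+' is shielded before decoding.
        String decoded = URLDecoder.decode(path.replace("+", "%2B"), StandardCharsets.UTF_8);
        return encodePath(decoded);
    }

    public static void main(String[] args) {
        System.out.println(encodePath("/a%20b"));           // pre-encoded input, escaped again
        System.out.println(decodeThenEncodePath("/a%20b")); // pre-encoded input, round-tripped
        System.out.println(decodeThenEncodePath("/a%2Fb")); // encoded slash becomes a path separator
    }
}
```

The last case is part of the can of worms: after decoding, an encoded `/` is indistinguishable from a real path separator. A stray `%` that is not followed by two hex digits is another, since the decoder may reject it outright.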

Note also that this is based on the branch with the fix for #6137 and includes it, on the assumption that it will be reviewed and merged first.

@github-actions github-actions bot added the Type: Feature Request, Status: Pending Review, and fetch urls labels Nov 6, 2023
@tfmorris tfmorris removed the Status: Pending Review label Nov 8, 2023
@tfmorris tfmorris requested a review from wetneb December 4, 2023 18:50

@wetneb wetneb left a comment

Thanks for tackling this.

@tfmorris tfmorris requested a review from wetneb February 2, 2024 22:54
@tfmorris

@wetneb ping. I addressed your comment a couple of weeks ago and this is ready for re-review.

@wetneb wetneb left a comment

Sorry for the delay!

final String BASE_URL = "https://example.com/";
final String TEST_URL = BASE_URL + ARABIC_PATH;
assertEquals(HttpClient.getEscapedUrl(TEST_URL), BASE_URL + UrlEscapers.urlPathSegmentEscaper().escape(ARABIC_PATH));
}

Would it not be worth adding some test cases to demonstrate the handling of already-escaped URLs, to make sure they are not doubly escaped?
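
One property such test cases could assert is idempotence: escaping an already-escaped URL should return it unchanged. The sketch below is self-contained and deliberately does not call `HttpClient.getEscapedUrl`. Instead, `escapePathOnce` is a hypothetical stand-in that keeps valid `%XX` triplets and escapes everything else outside the RFC 3986 unreserved set, purely to show the shape of the assertions.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: escapePathOnce is a hypothetical stand-in for the escaper under test.
final class IdempotentEscapeCheck {
    private static boolean isHex(int c) {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f');
    }

    // RFC 3986 unreserved characters, plus '/' so that path separators survive.
    private static boolean isSafe(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~' || c == '/';
    }

    static String escapePathOnce(String path) {
        byte[] bytes = path.getBytes(StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i] & 0xFF;
            // A '%' followed by two hex digits is treated as an existing escape and kept.
            boolean existingEscape = b == '%' && i + 2 < bytes.length
                    && isHex(bytes[i + 1] & 0xFF) && isHex(bytes[i + 2] & 0xFF);
            if (isSafe(b) || existingEscape) {
                out.append((char) b);
            } else {
                out.append(String.format("%%%02X", b));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] inputs = { "/a b", "/a%20b", "/100%", "/caf\u00e9" };
        for (String input : inputs) {
            String once = escapePathOnce(input);
            String twice = escapePathOnce(once);
            System.out.println(input + " -> " + once + " (idempotent: " + once.equals(twice) + ")");
        }
    }
}
```

Keeping existing escapes is itself a design choice with a cost: a user who really means a literal `%20` in a path cannot express it, which is the flip side of avoiding double escaping.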

Successfully merging this pull request may close these issues.

Correctly encode URL strings before use by Fetch URL
2 participants