Validate the normalized hostname #158
@@ -564,12 +565,6 @@ def leading_and_trailing_whitespace
it { is_expected.to eq(expected) }
end

context "oddly enough, does not alter URLs with consecutive dots" do
Hehe, the quirk with twingly-url is that it was intentionally meant to mimic the behavior of the good ol' .NET implementation. As I gather, the change here makes the two implementations diverge. Not sure it matters anymore though 🤷
Indeed, but I can't really see the benefit of that :) It shouldn't really be possible to collect anything from sites with URLs like http://www..twingly..com/
Just mentioning that normalization of links is/used to be a thing; in those cases it didn't necessarily mean that they'd have to be "visitable".
True... but shouldn't everything that gets normalised start out as a visitable URL? I suspect we constructed this URL by hand during development; it didn't come from the real world.
e093ccc to 1522f47
@@ -103,11 +108,27 @@ def try_addressable_normalize(addressable_uri)
raise
end

def valid_hostname?(hostname)
# No need to check the TLD, the public suffix list does that
labels = hostname.split(DOT)[0...-1].map(&:to_s)
Need the 0 for JRuby 9.3: https://github.com/twingly/twingly-url/actions/runs/3192326669/jobs/5209668938#step:4:30
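For readers following along, here is a small sketch of what the slice with the explicit `0` does. The helper name `labels_without_tld` and the sample hostnames are made up for illustration, and `DOT` is assumed to be `"."` as the diff suggests. The explicit start index is presumably needed because the beginless form (`[...-1]`) is newer Ruby syntax that the JRuby 9.3 job did not accept; that reading is an assumption based on the comment above, not something the CI log is quoted as saying here.

```ruby
# Assumed to match the DOT constant referenced in the diff.
DOT = "."

# Hypothetical helper: returns every label except the last one (the TLD).
def labels_without_tld(hostname)
  # Explicit start index 0, exclusive end at the last element.
  hostname.split(DOT)[0...-1]
end

p labels_without_tld("www.twingly.com")   # ["www", "twingly"]
p labels_without_tld("www..twingly..com") # ["www", "", "twingly", ""]
```

Note that consecutive dots show up as empty labels, which is what makes them easy to reject later.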
This is ready now.
Noticed one tiny thing, then this LGTM!
EDIT: Didn't include the comment in this approve and now I can't delete it :P
Noticed one tiny thing, then this LGTM!
👍
🎉
According to the "preferred format" used by DNS.
See https://en.wikipedia.org/wiki/Domain_Name_System#Domain_name_syntax,_internationalization
Moves one URL from the set of valid URLs to the set of invalid URLs (if you enter http://www..twingly..com/ in the address bar in Chrome, it does a search; it doesn't try to visit any site).
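To make the "preferred format" concrete, here is a minimal sketch of a label check along those lines. It is not the PR's actual implementation: the method name `preferred_format_hostname?` and the `LABEL` pattern are assumptions for illustration, based on the usual letters-digits-hyphen rule (1 to 63 characters per label, no leading or trailing hyphen). Like the diff, it skips the last label and leaves the TLD to the public suffix list.

```ruby
# Assumed to match the DOT constant referenced in the diff.
DOT = "."

# Hypothetical pattern for one label: letters, digits and hyphens,
# 1 to 63 characters, not starting or ending with a hyphen.
LABEL = /\A[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\z/i

# Hypothetical stand-in for valid_hostname?; the TLD (last label) is skipped.
def preferred_format_hostname?(hostname)
  labels = hostname.split(DOT)[0...-1]
  labels.all? { |label| label.match?(LABEL) }
end

p preferred_format_hostname?("www.twingly.com")   # true
p preferred_format_hostname?("www..twingly..com") # false, empty labels
p preferred_format_hostname?("-www.twingly.com")  # false, leading hyphen
```

Empty labels fail because the pattern requires at least one character, so URLs with consecutive dots are rejected without any special casing.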
1522f47 to ae2d264
These are now invalid after twingly#158. I think we can close twingly#74. Close twingly#74
Regression from twingly#158
According to the "preferred format" used by DNS.
See https://en.wikipedia.org/wiki/Domain_Name_System#Domain_name_syntax,_internationalization
Moves one valid URL to the set of invalid URLs (if you enter http://www..twingly..com/ in the address bar in Chrome, it does a search, doesn't try to visit any site).
Close #62