Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix weblinks #4288

Merged
merged 16 commits into from Dec 19, 2022
Merged

Fix weblinks #4288

merged 16 commits into from Dec 19, 2022

Conversation

jerch
Copy link
Member

@jerch jerch commented Dec 4, 2022

This PR tries to tackle several issues regarding weblinks:

  • fix underline offsets
  • allow more broad url scheme (unicode, with name and password portion)
  • use URL() to validate instead of bloated regexp

TODO:

Closes #4294.
Closes #3539.
Closes #4247.
Closes #3721.
Closes #2887.

@jerch
Copy link
Member Author

jerch commented Dec 5, 2022

@Tyriar Do you happen to have an url test battery in vscode (some list of naughty url constellations), so we could check for more edge cases or things, that should explicitly match or not match as url?

src/browser/Linkifier2.ts Outdated Show resolved Hide resolved
@jerch
Copy link
Member Author

jerch commented Dec 7, 2022

FIXME: Also exclude any sort of brackets from url scheme: <>()[]{}

@jerch
Copy link
Member Author

jerch commented Dec 8, 2022

@Tyriar Imho this is ready for review.

Note that I changed the way how urls are matched - I removed the old explicit positive matching with an exclude matching to allow for newer unicode based urls (supported by all major browsers). Ofc with exclude matching the chance is higher to wrongly match too much, but thats something we can still adjust later on.

The WebLinkProvider._mapStrIdx method might be code we want to move to Buffer later on (maybe replacing the partially broken Buffer.stringIndexToBufferIndex), as we need that functionality in several places.

@jerch
Copy link
Member Author

jerch commented Dec 12, 2022

@Tyriar Thx to gh suggestions I just found out about https://github.com/microsoft/vscode-uri

Could we just use that as well in the weblinks addon? I guess that it would simplify alot and avoid lengthy discussions in follow-up issues about treating this or that as valid/invalid url?

@Tyriar
Copy link
Member

Tyriar commented Dec 16, 2022

vscode-uri is good, but I think it also has some quirks like always encoding the URL when sometimes it's not desirable (eg. encoding [ and ]). Could try that out to see how much we get from it? It's an addon so I'm not too concerned about a dependency, especially if it's from our team.

bin/test_weblinks.sh Outdated Show resolved Hide resolved
addons/xterm-addon-web-links/test/WebLinksAddon.api.ts Outdated Show resolved Hide resolved
// - final interpunction like ,.!?
// - any sort of brackets <>()[]{} (not spec conform, but often used to enclose urls)
// - unsafe chars from rfc1738: {}|\^~[]`
const strictUrlRegex = /https?:[/]{2}[^\s"'!*(){}|\\\^<>`]*[^\s"':,.!?{}|\\\^~\[\]`()<>]/;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing people will complain about regressions here, interesting learning about rfc1738 though. Should we be more explicit about the intended scope of the addon in the readme? I've always viewed it as a decent basic URL detector, but it's not meant to be feature complete and handle a bunch of edge cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing people will complain about regressions here

Yes thats the downside of turning the matcher logic upside-down, but I did not find another way while allowing these unicode uris to match. Will see if I get less "overmatching" from the vscode-uri package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how to go about the risk of overmatching. I think the new matcher is way more capable than the old one, so I'd suggest to go with new one and lets see, if ppl stumble into tons of issues?

And if thats a big issue for peeps, the old addon version would still be around and work as before, so we can fix things iteratively?

@Tyriar Tyriar added this to the 5.1.0 milestone Dec 16, 2022
@jerch
Copy link
Member Author

jerch commented Dec 19, 2022

@Tyriar Oh well, vscode-uri does not help here much, initially I thought it brings a method to identify url offsets within a string, but it is basically an enhanced new URL() nodejs version, but for that we can just stick to the browser's impl. So nothing to be gained here from vscode-uri.

Copy link
Member

@Tyriar Tyriar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, I'll make a note on the release notes about how this is a big change

@Tyriar Tyriar merged commit 84b8121 into xtermjs:master Dec 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment