Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use web_sys::Url instead of url::Url on wasm32 targets #2050

Closed
micolous opened this issue Nov 27, 2023 · 4 comments
Closed

Use web_sys::Url instead of url::Url on wasm32 targets #2050

micolous opened this issue Nov 27, 2023 · 4 comments

Comments

@micolous
Copy link

micolous commented Nov 27, 2023

On wasm32, url adds 279KB to the compiled binary size. A significant chunk of that is because of IDN support, which requires a large look-up table.

wasm32 targets have web_sys::Url, which provides similar functionality to url::Url at much a smaller binary size. It even supports IDN on modern browsers, which have their own internal look-up table.

reqwest should be able to use web_sys::Url instead of url::Url on wasm32 targets, or some other generic url::Url-like trait.

Detail / background

There have been attempts in the url crate to make idna support optional to cut down most of this size, however they have been rolled back, and progress appears stalled.

The current upstream-recommended work-around is to patch out the idna crate with a replacement which implements the same API. There aren't any browser/JS APIs which provide IDN support in a way that could replace the idna crate. One could break IDNA support entirely – but that's not great.

However, wasm32 targets do expose the web_sys::Url type, which supports IDN just fine on modern browsers.

The request here is for reqwest to be able to use web_sys::Url instead of url::Url on wasm32 targets.

That look-up table overhead exists on all other platforms as well, and it may be better to have reqwest define a trait that allows easily swapping out the Url implementation.

@micolous
Copy link
Author

I've had a look into this:

  • reqwest already defines an IntoUrl trait, which converts String-like values into a url::Url, and checks the scheme.

    Making this work with web_sys::Url would be pretty straight forward.

  • reqwest only seems to use url::Url to validate a URL - and on hyper it converts it to the http::Uri type!

    Having two URL parsing libraries isn't good, but http::Uri doesn't implement IDN, so using that exclusively would be a regression for IntoUrl.

  • There's a wasm-specific version of the Request type, which owns a url::Url as a field. It also exposes that url::Url via Request::url{,_mut}(). This would need to change, and there's a few options:

    • Request::url could change to just web_sys::Url on wasm targets. That would break the existing API on wasm only, and there is no impl From for either type.

      This couldn't be a feature because it would violate the "features are additive" principle.

    • Request::url could be swapped out with an enum that corresponds either web_sys::Url or url::Url; but that would break the existing API.

      Different features could be additive within this enum.

    • Request::url could become a generic parameter that accepts either url::Url or web_sys::Url using a different trait (because IntoUrl is being used to validate Strings) wasm::client::fetch() converts the URL value to a String to pass to JS, so it doesn't seem to be important what the underlying implementation is.

      Becoming a type parameter would impact users storing the request::Request, as that would now have a type parameter on wasm.

      Things like IntoUrl could only give one type, and that'd change depending on the platform.

I'd lean towards making web_sys::Url the Url type on WASM platforms; mainly because all of the options result in some API breakage, but implementation is less complicated, and the API becomes "use the native Uri type for the platform".

One could go for http::Uri instead, but that would break IDN support.

@seanmonstar
Copy link
Owner

A couple of unsorted thoughts:

  • Depending on Url is at this point just legacy. I've desired to remove that dependency, it's a breaking change (though one is coming up anyways), and a few other things would need a replacement (like the query() method).
  • We can't make the type different on WASM (url::Url vs web_sys::Url), not publicly, without breakage. Some who are compiling to WASM may already be expecting it to be that exact type.

@micolous
Copy link
Author

Yeah, breaking APIs is the biggest blocker for all of this.

I've also had a look into whether something could be done on the url side, and servo/rust-url#887 is one such option.

@seanmonstar
Copy link
Owner

So at least for now, I don't think we can make this specific change. We might remove url entirely, but that's a separate thing.

@seanmonstar seanmonstar closed this as not planned Won't fix, can't repro, duplicate, stale Dec 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants