feat: interaction of Host and OpenAPI servers #36

Open
adriangb opened this issue Feb 1, 2022 · 4 comments

Comments

@adriangb
Owner

adriangb commented Feb 1, 2022

Should they interact? If so, how are they related?

@sm-Fifteen

sm-Fifteen commented Feb 7, 2022

The Host header should probably be disregarded as far as URL building is concerned. That header is not meant for that and you run into problems with reverse proxies (encode/starlette#843) when you try using it that way. A reverse proxy not replacing the HTTP Host header when passing a request to an application server violates that section of the HTTP 1.1 specification.

deep breath

With that said, there is currently a huge void across all HTTP/HTML specs for applications that don't statically know what their canonical root path is. Host is not meant to be used by an application to determine what its URL authority should be (because there might be a reverse proxy changing the header, and the reverse proxy has to change the header because there might be a second reverse proxy downstream), yet in most web applications and frameworks, that's the only time the Host header will be acknowledged. RFC 7230 says you can use the Host header to build the effective request URI, as in the one from the point of view of the machine that directly contacted the application server, but that URI might not be valid for the client. There's really no universally correct way of handling this automatically: your application either has to know what its authority and root path are from its own configuration, or break the specification in one or many ways.
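
To make the mismatch concrete, here is a minimal sketch of the effective request URI for a plain origin-form request (simplified from RFC 7230, not a full implementation). The internal address, public base URL and path are made-up values for illustration only.

```python
def effective_request_uri(scheme: str, host_header: str, request_target: str) -> str:
    """Rebuild the URI as seen by whoever connected directly to this server.

    Simplified: only handles origin-form targets such as "/items/1".
    """
    return f"{scheme}://{host_header}{request_target}"


# Hypothetical request as it might arrive after a reverse proxy rewrote it.
seen_by_app = effective_request_uri("http", "10.0.0.5:8000", "/items/1")

# Hypothetical public address, known only from the application's own configuration.
PUBLIC_BASE = "https://example.com/api"
seen_by_client = PUBLIC_BASE + "/items/1"

print(seen_by_app)     # the proxy's view of the resource
print(seen_by_client)  # the URL the client would actually need
```

The first URI is correct from the proxy's point of view, but a client outside the network could not use it.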

There's a similar issue for static client-side applications (especially single-page apps with routing) that may be served from any base path: an SPA being served from localhost/foo/bar instead of localhost/ cannot be dynamically informed of its own URL resolution context the way the <base> HTML tag does (with its own issues). The Content-Location header used to work for this, but that specific use case was deprecated between RFC 2616 and RFC 7231 for the (completely sound and valid) reason that it made that same header wear multiple hats that really didn't go well together:

The definition of Content-Location has been changed to no longer
affect the base URI for resolving relative URI references, due to
poor implementation support and the undesirable effect of potentially
breaking relative links in content-negotiated resources.
(Section 3.1.4.2)

All of that to say: HTTP is really not built to handle resources and servers that don't know their base URL in advance and attempting to do that with the way things currently are is going to make your life miserable in all sorts of ways.

Don't try and use Host for anything if you value your sanity.

@adriangb
Owner Author

adriangb commented Feb 7, 2022

Just to clarify: I was referring to the Host class from Starlette, not the Host header (although the Host class uses the Host header for routing...). My thought was that if you are saying "these routes only accept requests from XYZ host" it might make sense to add that host as a server in OpenAPI (and override other servers) for those paths. Sorry if I caused miscommunication between the Host route/routing and Host header.

This said, if I understand your point correctly (it is a lot of in-depth information; I really appreciate it but will have to read it several times to absorb), routing based on Host is itself a questionable idea? That would mean maybe we shouldn't directly support it in Xpresso (the opposite of my initial thought, which was to better support it).
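
For reference, the "add that host as a server for those paths" idea could look roughly like the sketch below. It uses the path-level `servers` field from the OpenAPI 3 Path Item Object and works on a plain dict. `add_host_servers` is a hypothetical helper, not part of Xpresso or Starlette, and the hostnames are placeholders.

```python
import copy
from typing import Any, Dict, Iterable


def add_host_servers(
    openapi: Dict[str, Any],
    host: str,
    paths: Iterable[str],
    scheme: str = "https",
) -> Dict[str, Any]:
    """Return a copy of the document where each listed path gets a
    path-level `servers` entry, overriding the document-level servers."""
    result = copy.deepcopy(openapi)
    for path in paths:
        path_item = result.setdefault("paths", {}).setdefault(path, {})
        path_item["servers"] = [{"url": f"{scheme}://{host}"}]
    return result


# Hypothetical document: "/users" is only routed for api.example.com.
doc = {
    "openapi": "3.0.3",
    "servers": [{"url": "https://example.com"}],
    "paths": {"/users": {"get": {}}, "/health": {"get": {}}},
}
patched = add_host_servers(doc, "api.example.com", ["/users"])
```

Only the listed paths receive the override. Other paths keep inheriting the top-level `servers`.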

@sm-Fifteen

sm-Fifteen commented Feb 7, 2022

Ah, my bad, reading Host capitalized in relation to the OAS server section made me jump to the conclusion that the header is what you meant. I looked into this pretty in-depth a year or two back to try and untangle Starlette's url_for situation (see the Starlette issue linked in the previous post, encode/starlette#843) and to see what the correct solution was, only to find out that the current state of things on the HTTP side frustratingly had nothing to handle this sort of use case. It was more or less a lost cause: either disregard a bunch of specs to make this work (in a way that's likely incompatible with a bunch of setups), or give up and keep a URL generation implementation that's going to break in most production setups.

Regardless of that, though, yes, Starlette's host-based routing runs the risk of having the same issues regarding reverse proxies. I don't know how something like, say, Stack Overflow handles having their own multi-tenant application available from hundreds of different domain names, though it clearly works for them.

One important thing to note here is that the architecture of Stack Overflow & Stack Exchange Q&A sites is multi-tenant. This means that if you hit stackoverflow.com or superuser.com or bicycles.stackexchange.com, you’re hitting the exact same thing. You’re hitting the exact same w3wp.exe process on the exact same server. Based on the Host header the browser sends, we change the context of the request. Several pieces of what follows will be clearer if you understand Current.Site in our code is the site of the request. Things like Current.Site.Url() and Current.Site.Paths.FaviconUrl are all driven off this core concept.

[source]

I'm guessing they deal with the Host header at the edge and pass it to applications via some special header, but honestly I'm kind of at a loss.
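
As a generic illustration of that guess (not a claim about how Stack Overflow actually does it), an application behind an edge proxy could read the original host from the de facto `X-Forwarded-Host` header, but only when the request comes from a proxy it trusts. The proxy address, the `tenant_host` helper and the lowercase-keyed header mapping are all assumptions made for this sketch.

```python
from typing import Mapping

# Hypothetical address of the edge proxy allowed to set forwarding headers.
TRUSTED_PROXIES = {"10.0.0.1"}


def tenant_host(headers: Mapping[str, str], peer_ip: str) -> str:
    """Pick the host used to select the tenant.

    Assumes header names were already lowercased by the server.
    """
    forwarded = headers.get("x-forwarded-host")
    if forwarded and peer_ip in TRUSTED_PROXIES:
        # The header may hold a comma-separated list; the first entry is
        # the host the original client asked for.
        return forwarded.split(",")[0].strip().lower()
    # Untrusted peer or no forwarding header: fall back to Host.
    return headers.get("host", "").lower()


request_headers = {"host": "app:8000", "x-forwarded-host": "superuser.com"}
via_proxy = tenant_host(request_headers, "10.0.0.1")
direct = tenant_host(request_headers, "203.0.113.9")
```

The trust check matters because any client can send `X-Forwarded-Host` itself.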

@adriangb
Owner Author

adriangb commented Feb 7, 2022

The way I usually do it is at the reverse proxy level. Then I add an environment variable that hardcodes the host / path prefix, so that I can just concatenate it with the relative path I want to redirect to.
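
A minimal sketch of that approach, assuming a hypothetical `APP_BASE_URL` environment variable set at deploy time. The variable name, the localhost fallback and the `external_url` helper are illustrative only.

```python
import os


def external_url(relative_path: str) -> str:
    """Join the configured public base (host + path prefix) with a relative path."""
    # APP_BASE_URL is a hypothetical variable name; the default is for local development.
    base = os.environ.get("APP_BASE_URL", "http://localhost:8000")
    # Plain concatenation keeps the path prefix intact.
    return base.rstrip("/") + "/" + relative_path.lstrip("/")


# Example: deployed behind a proxy that serves the app under /api.
os.environ["APP_BASE_URL"] = "https://example.com/api/"
redirect_target = external_url("/login")
```

Concatenation is deliberate here: `urllib.parse.urljoin` would treat `/login` as an absolute path and drop the `/api` prefix.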

Regardless, it seems like the most prudent thing is probably to punt on this feature (and any integrations specific to Host routing) unless someone comes asking for it with a convincing use case. Of course the routing part will still work (it's coming straight from Starlette), but it won't impact the OpenAPI docs or anything.


2 participants