Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Response::remote_addr() #489

Merged
merged 5 commits into from Oct 3, 2022
Merged

Add Response::remote_addr() #489

merged 5 commits into from Oct 3, 2022

Conversation

mvforell
Copy link
Contributor

@mvforell mvforell commented Mar 5, 2022

Fixes #488.

@mvforell mvforell changed the title Add Response::remote_addr() (fixes #488) Add Response::remote_addr() Mar 5, 2022
src/response.rs Outdated
@@ -226,6 +229,11 @@ impl Response {
charset_from_content_type(self.header("content-type"))
}

/// The remote address that sent this response.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Two questions.

I think we need to document why this sometimes can be None – is it because of tests?

I also wonder what happens for SOCKS proxies. Does reqwest do anything special (X-Forwarded-For)? Is it expected to do anything?

Copy link
Contributor Author

@mvforell mvforell Mar 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's both because of tests (Stream::socket() returns an Option<_> because Inner::socket() does so) and because TcpStream::peer_addr() returns a Result<_>. What do you think of adding

This may return None in tests or if the internal call to std::net::TcpStream::peer_addr() returns an Err.

to the doc?

reqwest relies on hyper, which also just uses TcpStream::peer_addr() (see https://docs.rs/hyper/0.14.17/src/hyper/client/connect/http.rs.html#364). I'm not familiar with SOCKS proxies, so I don't know what happens for them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you had time to take another look? I don't want to rush you of course

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning towards panic if we use this function and it's constructed for a test.

It's because the test case usage is so rare, it feels strange to base the API on that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just for tests - TcpStream::peer_addr() can return an error based on what the OS returns. Tracking through the stdlib, it looks like Rust calls out to getpeername(7), which on Linux is:

https://man7.org/linux/man-pages/man2/getpeername.2.html#ERRORS

       EBADF  The argument sockfd is not a valid file descriptor.

       EFAULT The addr argument points to memory not in a valid part of
              the process address space.

       EINVAL addrlen is invalid (e.g., is negative).

       ENOBUFS
              Insufficient resources were available in the system to
              perform the operation.

       ENOTCONN
              The socket is not connected.

       ENOTSOCK
              The file descriptor sockfd does not refer to a socket.

I think we could maybe encounter ENOTCONN, for instance if the connection was closed by the remote end between syscalls. I would hate to panic in that case. And ENOBUFS seems possible in some rare cases.

I'm leaning towards the Option, even though it should be quite rare for it to be None.

Actually, though - this is making me realize two things: (a) this PR adds an extra syscall to every response, even for users who never plan to call remote_addr() (most users). And (b) we shouldn't have to pay a syscall to find out what the remote address is, because we already picked a remote address to connect to.

@mvforell what do you think of modifying this so instead ureq keeps track of the IP address it connected to, and passes that along to the Response?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll gladly remove the syscall if we don't need it, but I am unable to figure out where ureq "already picked a remote address to connect to." Could you maybe point me to the relevant part of the code?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're talking about two places where "resolver" is used:

https://github.com/algesten/ureq/blob/main/src/stream.rs#L358-L362
https://github.com/algesten/ureq/blob/main/src/stream.rs#L458-L463

In both cases we pick [0] from the returned addresses, and what we need to do is squirrel away that result to make it available on Response.

@mvforell
Copy link
Contributor Author

mvforell commented May 23, 2022

Sorry for the delay, I've finally found time to look into this again.

I've pushed a new implementation that doesn't use TcpStream::peer_addr() and where Response::remote_addr() returns a SocketAddr instead of an Option<SocketAddr>.

I'm still unsure about two things regarding this new implementation:

  1. I'm not familiar with SOCKS proxies. I think with this implementation, Response::remote_addr() returns the socket address of the proxy. I'm not sure if this is the wanted behaviour; if not, I would need a few more pointers on how to change it so that the actual server's address is returned. Either way, this should probably be documented, e.g. by adding "When using a SOCKS proxy, this returns the proxy's address." to Response::remote_addr()'s documentation.
  2. When using Stream::from_vec(), it is not clear what the remote address should be. If I understand correctly, this method is only used for tests and for Response::from_str(). In tests, the remote address isn't used; Response::from_str() already hard-codes example.com as the request URL, so I took the liberty to hard-code the remote address to 0.0.0.0:0. Please let me know if you disagree with this.

@mvforell mvforell reopened this May 23, 2022
@mvforell
Copy link
Contributor Author

Whoops, I have to admit I forgot about this. 😅

I'd be happy to rebase and resolve the conflicts, but before diving back into ureq's code, it'd be great if you could give me some feedback regarding the current implementation and my two concerns mentioned in the last comment.

Copy link
Owner

@algesten algesten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Minimal changes. Nice!

@mvforell
Copy link
Contributor Author

Glad to hear that! :) I've merged the upstream changes, please take a final look.

@algesten
Copy link
Owner

Would you mind rebasing off the latest main, so we can see a clean test run?

@mvforell
Copy link
Contributor Author

Done

@mvforell
Copy link
Contributor Author

Alright, now that I ran cargo test with all the features enabled all the tests should compile and pass 😅

@algesten
Copy link
Owner

There's also the test.sh bash script that runs a few of the feature flag combos the CI runs.

@mvforell
Copy link
Contributor Author

Good to know!

Copy link
Owner

@algesten algesten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy landing this now. I was briefly wondering if it would be better to squirrel away the remote socket address in Unit instead of Stream, but then I realized we need the remote address also when fetching the connection from the Pool, so inside Stream makes more sense.

@jsha Does this look OK to you?

@algesten
Copy link
Owner

algesten commented Oct 1, 2022

@mvforell There are some conflicts against main.

@mvforell
Copy link
Contributor Author

mvforell commented Oct 1, 2022

@mvforell There are some conflicts against main.

Hm, GitHub says "This branch has no conflicts with the base branch", and my local checkout says:

> git fetch upstream && git merge upstream/main
Already up to date.

@algesten
Copy link
Owner

algesten commented Oct 1, 2022

@mvforell sorry! must have had an old browser tab or something.

Copy link
Collaborator

@jsha jsha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks terrific! Thanks for writing this.

@jsha jsha merged commit 855f20e into algesten:main Oct 3, 2022
@mvforell
Copy link
Contributor Author

mvforell commented Oct 3, 2022

I'm really happy to see this merged, thank you for your feedback and pointers!

Do you have an estimate for when the next version including this will be released?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Suggestion: add Response::remote_addr() method
3 participants