Don't set a user-agent header by default. #540

Closed
wants to merge 1 commit into from

Commits on Jun 7, 2019

  1. Don't set a user-agent header by default.

    I'm not entirely sure why this behavior was added to begin with. It was
    originally added in 1259128, which only
    has "add native-tls and serde json support" as the commit message, so
    I'm not even sure if it was intentional or just meant to temporarily
    work around some other issue.
    
    Reqwest is not a user agent, and should not identify itself as such. To
    give a more concrete example of what I mean: `curl`, the CLI tool that
    makes requests directly on behalf of a user, identifies itself as a
    user agent. `libcurl`, the library used to programmatically make HTTP
    requests (e.g. to build a user agent), does not.
    
    Why does this matter? Some services, including GitHub and crates.io,
    choose to block traffic that does not provide a user agent. Services
    which do this typically do so for a reason. Ultimately `reqwest/0.9.18`
    is no more useful to someone operating a service than an empty string.
    If folks have a legitimate privacy concern with setting one, they can
    still set it to "hello" or whatever non-identifying value they want --
    but by setting a default user agent, that choice is being made for them,
    and they may not even be aware that the service they're hitting is
    requesting a unique user agent.
    
    If someone makes a request to crates.io without supplying a user agent
    header, they'll receive the following response:
    
        We require that all requests include a `User-Agent` header.  To allow us to determine the impact your bot has on our service, we ask that your user agent actually identify your bot, and not just report the HTTP client library you're using.  Including contact information will also reduce the chance that we will need to take action against your bot.
    
        Bad:
          User-Agent: reqwest/0.9.1
    
        Better:
          User-Agent: my_crawler
    
        Best:
          User-Agent: my_crawler (my_crawler.com/info)
          User-Agent: my_crawler (help@my_crawler.com)
    
        If you believe you've received this message in error, please email help@crates.io and include the request id {}.
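
To make the Bad / Better / Best distinction in that response concrete, here is a minimal Rust sketch of how a service might triage an incoming `User-Agent` value. The `UaQuality` enum, the `classify_user_agent` function, and both heuristics (a bare `name/version` token counts as a library default, a trailing parenthesized comment counts as contact information) are assumptions made for illustration, not crates.io's actual implementation.

```rust
/// Hypothetical triage of a `User-Agent` value, loosely following the
/// Bad / Better / Best examples in the response quoted above.
#[derive(Debug, PartialEq)]
enum UaQuality {
    /// No header at all, or only whitespace.
    Missing,
    /// Looks like a bare `library/version` token, e.g. `reqwest/0.9.1`.
    LibraryDefault,
    /// Names the bot but gives no way to reach its operator.
    Named,
    /// Names the bot and includes a parenthesized contact comment.
    NamedWithContact,
}

fn classify_user_agent(header: Option<&str>) -> UaQuality {
    let ua = match header.map(str::trim) {
        None | Some("") => return UaQuality::Missing,
        Some(ua) => ua,
    };

    // Assumed heuristic: a trailing `( ... )` comment carries contact details.
    if ua.contains('(') && ua.ends_with(')') {
        return UaQuality::NamedWithContact;
    }

    // Assumed heuristic: a single token of the form `name/<digit>...`
    // is what an HTTP library would send by default.
    let is_library_token = !ua.contains(char::is_whitespace)
        && ua.split_once('/').map_or(false, |(name, version)| {
            !name.is_empty() && version.starts_with(|c: char| c.is_ascii_digit())
        });

    if is_library_token {
        UaQuality::LibraryDefault
    } else {
        UaQuality::Named
    }
}
```

Under these assumptions, a service could answer `Missing` with the message above, while `LibraryDefault` is the case this commit is concerned with: the header is present, so the client never sees the message, yet the value tells the operator nothing about who is sending the traffic.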
    
    But if they're using reqwest, they won't get the opportunity to think
    about this. I end up blocking their traffic because their request rate
    is slightly too high and I have no other way to politely ask them to
    slow down, and everybody is worse off for it.
    
    The majority of APIs that I've interacted with over the years don't
    require a user agent, so most users of reqwest shouldn't ever notice
    this change. This of course will break some users (anyone who's using
    reqwest to hit crates.io and not setting an explicit UA for example),
    but I believe the benefit for both the people writing bots and those
    operating services they're hitting is worth it.
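
For callers who would be affected by this change, the remedy is to send an explicit, identifying value. The sketch below only builds the header string in the "Best" form from the crates.io message; `identifying_user_agent` is a hypothetical helper, not part of reqwest.

```rust
/// Hypothetical helper: formats `bot_name (contact)`, or just `bot_name`
/// when no usable contact string is supplied.
fn identifying_user_agent(bot_name: &str, contact: Option<&str>) -> String {
    // Treat an empty or whitespace-only contact as absent.
    match contact.map(str::trim).filter(|c| !c.is_empty()) {
        Some(contact) => format!("{} ({})", bot_name.trim(), contact),
        None => bot_name.trim().to_string(),
    }
}
```

The resulting string would then be attached through the HTTP client's own header API; with reqwest, for instance, it could be passed to a request builder's `.header(...)` method under `reqwest::header::USER_AGENT`.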
    sgrif committed Jun 7, 2019