Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rack::Request#host returns truncated data for domain names containing underscores, other characters #1604

Closed
pvande opened this issue Feb 15, 2020 · 2 comments · Fixed by #1606

Comments

@pvande
Copy link
Contributor

pvande commented Feb 15, 2020

In Rack 2.1.2, Rack::Request#host did no hostname validation, returning everything from #host_with_port up to the colon.

      def host
        # Remove port number.
        h = host_with_port
        if colon_index = h.index(":")
          h[0, colon_index]
        else
          h
        end
      end

This did not support IPv6 well, so #1561 was implemented to solve that. This changed the implementation of #host to one that relied on regular expressions. This was released in Rack 2.2.0.

      def host
        split_authority(self.authority)[0]
      end

      def authority
        forwarded_authority || host_authority || server_authority
      end

      def forwarded_authority
        if value = get_header(HTTP_X_FORWARDED_HOST)
          wrap_ipv6(split_header(value).first)
        end
      end

      def host_authority
        get_header(HTTP_HOST)
      end

      def server_authority
        host = self.server_name
        port = self.server_port

        if host
          if port
            "#{host}:#{port}"
          else
            host
          end
        end
      end

      def split_authority(authority)
        if match = AUTHORITY.match(authority)
          if address = match[:ip6]
            return match[:host], address, match[:port]&.to_i
          else
            return match[:host], match[:host], match[:port]&.to_i
          end
        end

        # Give up!
        return authority, authority, nil
      end

      AUTHORITY = /
        # The host:
        (?<host>
          # An IPv6 address:
          (\[(?<ip6>.*)\])
          |
          # An IPv4 address:
          (?<ip4>[\d\.]+)
          |
          # A hostname:
          (?<name>[a-zA-Z0-9\.\-]+)
        )
        # The optional port:
        (:(?<port>\d+))?
      /x

This provided not only better IPv6 support, but also did some validation of hostnames, as per RFC 952.

It was soon discovered that this was improperly parsing hostnames that were valid under the more lenient RFC 1123 (permitting hostnames to begin with digits and hyphens) as IP addresses (#1590), so in #1591, the regular expression was updated to match [an entire line of] the authority. This was released as Rack 2.2.2.

      AUTHORITY = /^
        # The host:
        (?<host>
          # An IPv6 address:
          (\[(?<ip6>.*)\])
          |
          # An IPv4 address:
          (?<ip4>[\d\.]+)
          |
          # A hostname:
          (?<name>[a-zA-Z0-9\.\-]+)
        )
        # The optional port:
        (:(?<port>\d+))?
      $/x

The Issue

While the match being performed in the AUTHORITY regex is technically correct for hostnames, it is not correct for DNS names (as per RFC 2181, Section 11). This also causes issues in environments like Docker, where some tools (e.g. Docker Compose) implicitly create hostnames containing underscores. These hostnames are being truncated in Rack 2.2.0, and being left unprocessed in Rack 2.2.2.

Suggestions

Given RFC 2181 relaxes all restrictions on DNS names (except segment and total length), it seems unnecessarily risky to use anything but a wildcard match in that position. IP addresses, however, have a very predictable form, and regular expressions for matching them correctly are already a part of the Ruby standard library (see Resolv::IPv4::Regex and Resolv::IPv6::Regex) in every supported version of Ruby.

@ioquatix
Copy link
Member

Thanks for the detailed explanation. I agree everything you say.

I need a little bit more time to review, but I'm open making this behaviour less specific. Do you want to make a PR to start with?

@pvande
Copy link
Contributor Author

pvande commented Feb 15, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants