Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recursor: respect DO bit in incoming queries #2196

Merged
merged 5 commits into from May 11, 2024

Conversation

japaric
Copy link
Contributor

@japaric japaric commented Apr 25, 2024

At the moment this implements the first half of RFC 4035, section 3.2.1: the DO bit is set in the queries the recursive resolver sends on behalf of the client. On the wire, I can see that the name servers it communicates with include extra DNSSEC records in their response to the resolver. However, the Recursor removes those extra DNSSEC records before answering the client. That stripping of DNSSEC records does the opposite of the second half of what the RFC dictates (when the DO bit is set).

AFAICT, the stripping of those DNSSEC records happens here. That function caches all the received records but only responds with the ones that exactly match the query and as DNSSEC records like RRSIG were not included in the original query it discards them.

I still have to figure out how to best avoid stripping so I'm leaving this as a draft PR for now.

P.S. I'm checking conformance with the tests in ferrous-systems/dnssec-tests#53

cc #2193

@bluejekyll
Copy link
Member

hm, this is definitely possible. There's a lot of special logic in place for DNSSEC on the authority, it's possible we're doing something incorrect there.

Copy link
Collaborator

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add some basic tests here?

@japaric japaric marked this pull request as ready for review April 29, 2024 15:28
@japaric
Copy link
Contributor Author

japaric commented Apr 29, 2024

I've updated the PR. now section 3.2.1 of RFC4035 is fully covered. all the tests in ferrous-systems/dnssec-tests#53 associated to that RFC section pass

delv is also closer to working but DS queries need special handling and that still needs to be implemented (section 3.1.4.1 of RFC4035) -- I think it makes sense to implement that next

  • without this PR
$ delv -p 1053 @127.0.0.1 +rtrace www.example.com.
;; fetch: www.example.com/A
;; fetch: com/DS
;; no valid RRSIG resolving 'com/DS/IN': 127.0.0.1#1053
;; broken trust chain resolving 'www.example.com/A/IN': 127.0.0.1#1053
;; resolution failed: broken trust chain
  • with this PR
$ delv -p 1053 @127.0.0.1 +rtrace www.example.com.
;; fetch: www.example.com/A
;; fetch: example.com/DNSKEY
;; fetch: example.com/DS
;; fetch: com/DS
;; no valid RRSIG resolving 'com/DS/IN': 127.0.0.1#1053
;; broken trust chain resolving 'example.com/DS/IN': 127.0.0.1#1053
;; broken trust chain resolving 'example.com/DNSKEY/IN': 127.0.0.1#1053
;; broken trust chain resolving 'www.example.com/A/IN': 127.0.0.1#1053
;; resolution failed: broken trust chain

(this fails because the query DS example.com. returns an empty answer section. that was also the case before this PR)

  • for reference, cloudflare's recursive resolver 1.1.1.1
$ delv @1.1.1.1 +rtrace www.example.com.
;; fetch: www.example.com/A
;; fetch: example.com/DNSKEY
;; fetch: example.com/DS
;; fetch: com/DNSKEY
;; fetch: com/DS
;; fetch: ./DNSKEY
; fully validated

@bluejekyll
Copy link
Member

What is delv is that like dig? It might not be documented, but I added a CLI to the project like dig a while back simply called dns, not sure if that's interesting to you: https://github.com/hickory-dns/hickory-dns/blob/main/util/src/bin/dns.rs

@japaric
Copy link
Contributor Author

japaric commented May 7, 2024

What is delv is that like dig?

delv is like dig in that it has a similar CLI and also does resolution but in addition to that it sends extra queries (to a recursive resolver) requesting DNSSEC records and then performs the validation of the entire chain of trust itself, on the client side. In contrast, most DNS clients rely on the recursive resolver doing the DNSSEC validation on their behalf and trust what the server reports back in the response's Authenticated Data (AD) bit / flag.

delv is used as a tool to debug / check if DNSSEC records / settings have correctly been set up in (authoritative / non-recursive) name servers. We use delv (and dig) in conformance tests to check that the DNS servers, both recursive resolvers and name servers, adhere to RFC specifications. We prefer to use "battle tested" third-party tools for these checks instead of relying on binaries / tools that depend on the code that's being tested.

@japaric
Copy link
Contributor Author

japaric commented May 7, 2024

Should we add some basic tests here?

it would be possible to do some unit testing around the behavior of the edns_set_dnssec_ok field, e.g. that build_message adds the right OPT pseudo-record to the message, but testing at a higher level would be way more involved.

there are comprehensive end-to-end tests for these RFC requirements in the conformance test suite ( ferrous-systems/dnssec-tests#53 and ferrous-systems/dnssec-tests#53 ). it would be less work to just rely on those


I fixed the CI failure (clippy warning) and can squash the commits if desired

@djc
Copy link
Collaborator

djc commented May 7, 2024

I fixed the CI failure (clippy warning) and can squash the commits if desired

Would be nice to squash the clippy commit into the originating commit. Otherwise it looks like these commits are at least somewhat logically independent?

when the recursor is "security-aware" -- that is the "dnssec" feature is
enabled -- as per RFC 4035 section 3.2.1
@japaric
Copy link
Contributor Author

japaric commented May 8, 2024

last two commits implement the build-pattern configuration API for Recursor I described in this comment as well as the named.toml validation and feature-gated API suggested by djc in this other comment.

I used the #[serde(deny_unknown_fields)] attribute / feature and it produces this error message

could not read config /etc/named.toml: Error { kind: TomlDecode(Error { kind: Custom, line: Some(0), col: 0, at: Some(0), message: "unknown field `security-aware`, expected one of `roots`, `ns_cache_size`, `record_cache_size`, `security_aware`", key: ["zones", "stores"] }) }

which looks quite useful to me, although it could be slightly better formatted instead of showing the Debug representation of toml::de::Error (that's the way to display errors that was already implemented)

Copy link
Member

@bluejekyll bluejekyll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Thanks for working through all of this @djc and @japaric!

@bluejekyll bluejekyll merged commit 97e1f43 into hickory-dns:main May 11, 2024
18 checks passed
@japaric japaric deleted the ja-recursor-respects-do branch May 13, 2024 07:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants