Add http node attestor #4909

kfox1111 · 2024-02-23T15:20:44Z

Adds an http node attestor

Commit conforms to CONTRIBUTING.md?
Proper tests/regressions included?
Documentation updated?

evan2645

Thank you for this, @kfox1111, and for your patience. I left a handful of high-level comments and questions, and held back many smaller ones. I think we're clear to move this out of draft and add tests etc. whenever you have a chance.

evan2645 · 2024-04-09T16:34:36Z

doc/plugin_agent_nodeattestor_httppop.md

@@ -0,0 +1,47 @@
+# Agent plugin: NodeAttestor "httppop"


IMO we should name this plugin something around DNS instead of HTTP, since what we're really attesting is that a DNS entry points at a machine, and the fact that we're confirming it using HTTP is an implementation detail

NodeAttestor "dns" ?

That may be why they called it acme rather than anything more specific....

acme has dns mode and http mode. Both do pretty different things. This plugin is most akin to the acme http protocol and may be a little confusing to people calling it dns as it doesn't do proof of possession over dns txt records like acme dns.

It still may be a little bit clearer, I think, as httppop, since the proof of possession token is hosted on an http server. Almost all http servers use dns, so that's a bit implied?

It would leave room for a dnspop plugin later that could function like acme dns mode, should that be desirable. (not sure it is)

naming things is hard. :/

This plugin is most akin to the acme http protocol and may be a little confusing to people calling it dns as it doesn't do proof of possession over dns txt records like acme dns.

Hmm ... that is a good point. I feel it's more about reachability when a certain DNS record is used, proving that you can serve traffic for a record .. I also see what you mean by ACME prior art, we are not proving control over DNS, but proving that we can serve a DNS name. I guess I'll spend some more time thinking about it

How about "http_challenge"?

evan2645 · 2024-04-09T18:09:39Z

doc/plugin_agent_nodeattestor_httppop.md

+|-------------------|--------------------------------------------------------------------------------------------------------------------------------------------------|-----------|
+| `hostname`        | Hostname to use for handshaking. If unset, it will be automatically detected.                                                                    |           |
+| `agentname`       | Name of this agent on the host. Useful if you have multilpe agents bound to different spire servers on the same host.                            | "default" |
+| `port`            | The port to listen on.                                                                                                                           | 80        |


I think we should choose a random port here by default. Low port numbers require root which we don't always have. Defaulting to a static number can be error prone since the port might be in use.

acme chose to standardize on port 80 as it's most likely to make it across the internet unhurt. If we choose a random port, I think we're probably going to get a bunch of support requests asking why the plugin is broken? :/

But maybe I'm assuming something here. Do we think intranet usage will be the most common and spire-agent -> spire-server over the internet will be uncommon?

acme chose to standardize on port 80 as it's most likely to make it across the internet unhurt

Well it's also the standard HTTP port for web facing services, and ACME traditionally fills requests for web facing services. This case feels different.

But maybe I'm assuming something here. Do we think intranet usage will be the most common and spire-agent -> spire-server over the internet will be uncommon?

I do think internet traversal of this traffic is uncommon. We use mTLS there, which frequently has trouble across the internet (e.g. corporate and ISP TLS interception). My bet is that cost/benefit will outweigh use of port 80 - cons: root required, may already be in use ... pros: less likely to be filtered by a firewall. Agent/server traffic currently defaults to port 8081. If I can't or don't want to use port 80, I have to choose some other random static port number, which also feels funny

Well it's also the standard HTTP port for web facing services, and ACME traditionally fills requests for web facing services. This case feels different.

I have seen a fair amount of use of certbot in http mode for non-webservers. But it is much more common to use it for webservers.

But maybe I'm assuming something here. Do we think intranet usage will be the most common and spire-agent -> spire-server over the internet will be uncommon?

I do think internet traversal of this traffic is uncommon. We use mTLS there, which frequently has trouble across the internet (e.g. corporate and ISP TLS interception). My bet is that cost/benefit will outweigh use of port 80 - cons: root required, may already be in use ... pros: less likely to be filtered by a firewall. Agent/server traffic currently defaults to port 8081. If I can't or don't want to use port 80, I have to choose some other random static port number, which also feels funny

Where I think it may get used on the internet is for things like edge computing, where you have one spire server on the internet and a spire-agent at each of multiple different organizations. Likely in that scenario, spire-server would be set up on port 443, which would make it out of all the orgs that were deploying the agent, and port 80 would need to be opened back into each organization. So getting N firewall teams to let port 80 in to a host might be easier than a random port at N orgs.

Again, not sure how common this will be, but somehow supporting the mode of operation for those that may need to do this kind of thing seems useful.

I'm kind of iffy on the con list too:

cons: root required, may already be in use

I kind of see those in some ways as extra security checks rather than cons. But I agree some may see it that way, and that's one of the reasons to have flags for allow_non_root_ports and allow_alternate_ports.
Maybe we should have an agent-side "use_random_port" flag too? Then it could be configured both ways.

How about we change up the args...
On the agent, if no port is specified, pick a random one. This still allows port 80 when desired.

On the server, allow all non root and alternate ports by default but still keep them for those that want to lock down the system further?

On the agent, if no port is specified, pick a random one. This still allows port 80 when desired.
On the server, allow all non root and alternate ports by default but still keep them for those that want to lock down the system further?

❤️ I like this much better
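
A minimal sketch of the agent-side behavior agreed on above, assuming an unset port is represented as 0 so the operating system picks a free ephemeral port. The helper name `listenForChallenge` is hypothetical and does not come from the PR.

```go
package main

import (
	"fmt"
	"net"
)

// listenForChallenge is a hypothetical helper. A port of 0 stands for
// "not configured": the kernel then assigns a free ephemeral port.
// Any other value (for example 80) is used as given.
func listenForChallenge(port int) (net.Listener, int, error) {
	l, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
	if err != nil {
		return nil, 0, err
	}
	// Read back the port that was actually bound so it can be advertised.
	return l, l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	l, port, err := listenForChallenge(0)
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer l.Close()
	fmt.Println("advertising port", port)
}
```

Binding port 80 this way still requires the privileges discussed in this thread; only the unset case avoids them.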

evan2645 · 2024-04-09T18:21:23Z

doc/plugin_server_nodeattestor_httppop.md

+| `allow_alternate_ports` | Set to true to allow ports other then 80 to be specified by the agent and honored during the handshake. If false, ports other then 80 will be rejected.   | false                               |
+| `allow_non_root_ports`  | Set to true to allow ports >= 1024 to be used by the agents with the advertised_port                                                                      | false                               |


Is there danger in enabling these? Should we just always allow it?

Some situations I can see (not exhaustive)

For allow_alternate_ports: if you have an internet-facing spire-server, agents can make the server spend extra time/resources asking for callbacks to ports that are more likely to get blackholed, I think. It could take longer for the server to decide to give up. Forcing it to be just one port controls the issue somewhat. The same goes for unintentionally asking for an arbitrary port when an intermediate firewall blocks everything but ports 80/443, leaving users wondering why the plugin is broken.

allow_non_root_ports being false adds an extra bit of security, much like NFS does. Say you have a shared unix box where multiple untrusted users can log in, but only as their own users (no sudo root). They could http attest with a high port and get their own agent running on the node under their own user, when the system admin wants to use http attestation for the whole node with a root-owned agent. These types of nodes are common in HPC environments, among others.

allow_non_root_ports being false adds an extra bit of security, much like NFS does. Say you have a shared unix box where multiple untrusted users can log in, but only as their own users (no sudo root). They could http attest with a high port and get their own agent running on the node under their own user, when the system admin wants to use http attestation for the whole node with a root-owned agent. These types of nodes are common in HPC environments, among others.

This is an important observation! It is the same problem as most CSP attestors, and there we work around it using TOFU. Should this attestor also have TOFU behavior for a given DNS name?

TOFU sounds like a good option in addition to the allow_non_root_ports feature. It adds a different, and mostly complementary, feature I think.

For bare metal nodes, it's more common to need to reuse the same hostname when reinstalling the node than in the cloud, I think. Needing to clear the name out of the spire server so the node can be re-provisioned can be painful and run into issues with automation. Because of this, the TOFU option may be worth it to some users but not others.

Also, thinking forward: when spire has support for combining multiple attestors, so that, say, a tpm plugin and an http_challenge plugin are both required, it would be desirable not to TOFU but to reattest, so that both the tpm and http challenge stay valid with regular re-attestation. TOFU wouldn't work in that environment, unless periodic reattestation could remove the TOFU registration when reattesting, I guess?

On the hybrid direction, I think that's something we'll need to figure out when it's time to cross that bridge since other attestor types will probably be in the same boat

For here, it does seem like TOFU is needed ... but, if you're root and can bind a low port number, then perhaps we don't need TOFU? I think the multi-tenancy aspect that drives the TOFU requirement assumes those workloads don't have root. If they do, then they own the box anyways. So with that in mind, how about we have a use_privileged_port_number configurable or similar, where you statically configure a port below 1024? The server side attestor can detect the low port number and automatically flip is_reattestable in its response based on that ... ?
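
The proposal above could be sketched roughly as follows. This is only an illustration: `canReattest` is a hypothetical helper, and how its result would be wired into the `is_reattestable` field of the server's response is not shown in this thread.

```go
package main

import "fmt"

// canReattest is a hypothetical helper illustrating the idea above:
// binding a port below 1024 normally requires root, so an agent that
// answers the challenge on such a port is treated as owning the host
// and may re-attest. Any other port would fall back to TOFU.
func canReattest(advertisedPort int) bool {
	return advertisedPort > 0 && advertisedPort < 1024
}

func main() {
	for _, port := range []int{80, 443, 1024, 8080} {
		fmt.Printf("port %d: reattestable=%v\n", port, canReattest(port))
	}
}
```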

evan2645 · 2024-04-09T18:23:36Z

pkg/agent/plugin/nodeattestor/httppop/httppop.go

+	return &Plugin{}
+}
+
+func (p *Plugin) ServeNonce(agentName string, nonce string) (err error) {


I think these functions should be unexported?

evan2645 · 2024-04-09T18:25:53Z

pkg/common/plugin/httppop/httppop.go

+	return &Challenge{Nonce: nonce}, nil
+}
+
+func CalculateResponse(challenge *Challenge) (*Response, error) {


What do you envision happening here? Since the challenge is fulfilled by out of band call back from the server, I don't think we need to send anything back on the stream?

Mostly a copy/paste thing from other plugins that use it...

But there does need to be a message from the agent to the server, after the agent starts up the http server and hosts the token, to tell the server it can now try to call it. So there is a Response, even though it's blank.

Should I remove the extra function or leave it for consistency with the other plugins?

kfox1111 · 2024-04-10T00:34:43Z

@evan2645 Thanks for the review and the discussion! All good things to consider. :)

Paul-Luciano-2003

Hello, just saying hi.
I have to build a better domain, pitch in for team players.


azdagron

Thanks, @kfox1111. I'm still thinking through the security of the challenge/response but here is some preliminary feedback.

azdagron · 2024-06-04T19:57:41Z

doc/plugin_agent_nodeattestor_http_challenge.md

+
+If `advertised_port` != `port`, you will need to setup an http proxy between the two ports. This is useful if you already run a webserver on port 80.
+
+A sample configuration:


We have two sample configurations in this file....

Fixed an issue with it, but the intention was for the second example to be specifically for the Proxies section... I can see how that could be confusing though. Maybe include "proxy" in the example string for it?

azdagron · 2024-06-04T19:58:18Z

doc/plugin_agent_nodeattestor_http_challenge.md

+```
+
+## Proxies
+


spelling: multilple

azdagron · 2024-06-04T19:58:31Z

doc/plugin_agent_nodeattestor_http_challenge.md

+| Configuration     | Description                                                                                                                                      | Default   |
+|-------------------|--------------------------------------------------------------------------------------------------------------------------------------------------|-----------|
+| `hostname`        | Hostname to use for handshaking. If unset, it will be automatically detected.                                                                    |           |
+| `agentname`       | Name of this agent on the host. Useful if you have multilpe agents bound to different spire servers on the same host and sharing the same port.  | "default" |


spelling: multilpe

azdagron · 2024-06-04T20:00:46Z

doc/plugin_server_nodeattestor_http_challenge.md

+
+| Configuration           | Description                                                                                                                                               | Default                             |
+|-------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------|
+| `dns_patterns`          | A list of regular expressions to apply to the hostname being attested. If none match, attestation will fail. If unset, all hostnames are allowed.         |                                     |


nit: allowed_dns_patterns

It also isn't clear what it means to "apply" the regex, i.e., we should be clear that the hostname must match at least one pattern.

azdagron · 2024-06-04T20:02:31Z

doc/plugin_server_nodeattestor_http_challenge.md

+| `dns_patterns`          | A list of regular expressions to apply to the hostname being attested. If none match, attestation will fail. If unset, all hostnames are allowed.         |                                     |
+| `required_port`         | Set to a port number to require clients to listen only on that port. If unset, all port numbers are allowed                                               |                                     |
+| `allow_non_root_ports`  | Set to true to allow ports >= 1024 to be used by the agents with the advertised_port                                                                      | true                                |
+| `agent_path_template`   | A URL path portion format of Agent's SPIFFE ID. Describe in text/template format.                                                                         | "{{ .PluginName }}/{{ .HostName }}" |


Do we need this? What parameters outside of HostName seem relevant? Is this mostly copy-paste from the other challenge-based attestors or is there a use-case in mind?

azure_msi, gcp_iit, sshpop and x509pop all have it. Just copy/pasted from the plugin I started with, but seems pretty common.
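
For context, the default `agent_path_template` from the table above can be rendered with Go's `text/template` roughly as below. This is a sketch: the `agentPathData` struct and the `renderAgentPath` helper are hypothetical, and the sample hostname is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// agentPathData is a hypothetical struct carrying the two fields that the
// default template refers to.
type agentPathData struct {
	PluginName string
	HostName   string
}

// renderAgentPath parses the template and executes it against the data.
// Referencing a field the struct does not have fails at execution time.
func renderAgentPath(tmpl string, data agentPathData) (string, error) {
	t, err := template.New("agent-path").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	path, err := renderAgentPath("{{ .PluginName }}/{{ .HostName }}", agentPathData{
		PluginName: "http_challenge",
		HostName:   "node1.example.org",
	})
	fmt.Println(path, err) // prints "http_challenge/node1.example.org <nil>"
}
```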


azdagron · 2024-06-04T20:48:55Z

pkg/common/plugin/httpchallenge/httpchallenge.go

+	return idutil.AgentID(td, agentPath)
+}
+
+func generateNonce() ([]byte, error) {


I'd suggest the nonce either be represented as raw bytes, or as a hex encoded string. If the latter, this should return a string type.
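
A sketch of the hex-encoded variant suggested above, with the function returning a string. The 32-byte length is an assumption chosen for illustration; the PR's actual nonce size is not shown in this thread.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateNonce returns a hex-encoded random nonce as a string.
// The 32-byte size is an assumption, not taken from the PR.
func generateNonce() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	nonce, err := generateNonce()
	if err != nil {
		fmt.Println("nonce generation failed:", err)
		return
	}
	fmt.Println(len(nonce), nonce)
}
```

Hex output is safe to place in a URL path segment, which matters because the challenge URL shown later in this review embeds the nonce in its path.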

azdagron · 2024-06-05T16:14:34Z

pkg/server/plugin/nodeattestor/httpchallenge/httpchallenge.go

+	notfound := false
+	for _, re := range config.dnsPatterns {
+		notfound = true
+		l := re.FindAllStringSubmatch(attestationData.HostName, -1)
+		if len(l) > 0 {
+			notfound = false
+			break
+		}
+	}
+	if notfound {
+		return status.Errorf(codes.PermissionDenied, "the requested hostname is not allowed to connect")
+	}


nit: I think it would be cleaner to extract this to a function. The notFound variable wouldn't be needed then (function could early-return if the hostname matches a pattern).

azdagron · 2024-06-05T16:15:33Z

pkg/server/plugin/nodeattestor/httpchallenge/httpchallenge.go

+	l := config.agentNamePattern.FindAllStringSubmatch(attestationData.AgentName, -1)
+	if len(l) != 1 || len(l[0]) == 0 || len(l[0]) > 32 {
+		return status.Error(codes.InvalidArgument, "agent name is not valid")
+	}


nit: can we extract this to a function, e.g. validateAgentName
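
A possible shape for `validateAgentName`, as a sketch under stated assumptions. The real `agentNamePattern` is not shown in this thread, so the alphanumeric pattern below is a placeholder. Also note that in the diff `len(l[0])` counts submatch groups; this sketch instead checks the length of the name itself against 32, which is an assumption about the intent.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// agentNamePattern is a placeholder; the PR's actual pattern is not shown.
var agentNamePattern = regexp.MustCompile(`^[a-zA-Z0-9]+$`)

// validateAgentName rejects empty names, names longer than 32 bytes
// (assumed intent of the original check), and names that do not match
// the pattern.
func validateAgentName(name string) error {
	if len(name) == 0 || len(name) > 32 || !agentNamePattern.MatchString(name) {
		return errors.New("agent name is not valid")
	}
	return nil
}

func main() {
	for _, name := range []string{"default", "", "a/b"} {
		fmt.Printf("%q -> %v\n", name, validateAgentName(name))
	}
}
```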


azdagron · 2024-06-05T17:08:35Z

pkg/common/plugin/httpchallenge/httpchallenge.go

+		Path:   fmt.Sprintf("/.well-known/spiffe/nodeattestor/http_challenge/%s/%s", attestationData.AgentName, challenge.Nonce),
+	}
+
+	resp, err := http.Get(turl.String())


Oh, code should use the context so that it can be cancelled (e.g. http.NewRequest + req.WithContext + http.DefaultClient.Do(req))
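
A sketch of a context-aware fetch along the lines suggested above. It uses `http.NewRequestWithContext`, which combines the `NewRequest` and `WithContext` steps. The helper names `fetchChallenge` and `demo`, the 1 KiB read limit, and the local test server are assumptions for illustration only.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchChallenge is a hypothetical helper: it performs a GET that is
// aborted when ctx is cancelled or its deadline passes.
func fetchChallenge(ctx context.Context, url string) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	// Cap the read so a misbehaving peer cannot send an unbounded body.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1024))
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// demo serves a fixed token from a local test server and fetches it twice:
// once with a live context and once with an already-cancelled one.
func demo() (token string, cancelledErr error, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "nonce-value")
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	token, err = fetchChallenge(ctx, srv.URL)

	cancelled, cancelNow := context.WithCancel(context.Background())
	cancelNow()
	_, cancelledErr = fetchChallenge(cancelled, srv.URL)
	return token, cancelledErr, err
}

func main() {
	token, cancelledErr, err := demo()
	fmt.Println("token:", token, "err:", err)
	fmt.Println("cancelled request failed:", cancelledErr != nil)
}
```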


kfox1111 requested review from evan2645, amartinezfayo, azdagron, MarcosDY and rturner3 as code owners February 23, 2024 15:20

kfox1111 marked this pull request as draft February 23, 2024 15:20

kfox1111 mentioned this pull request Feb 23, 2024

DNS/HTTP Node Attestor #4788

Open

amartinezfayo assigned evan2645 Feb 27, 2024

evan2645 reviewed Apr 9, 2024


Paul-Luciano-2003 approved these changes Apr 16, 2024


Paul-Luciano-2003 suggested changes Apr 20, 2024


Add http challenge node attestor

b322773

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

kfox1111 force-pushed the http branch from cfbfa0e to b322773 Compare May 10, 2024 16:49

kfox1111 added 12 commits May 11, 2024 09:48

Fix various issues so it works again after refactor

f08d9d2

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Fix some issues

009166f

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Fix some issues

7caac6f

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Fix some issues

3e68d07

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Fix some issues

baaa4f6

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Fix some issues

029f117

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Fix some issues

82af10b

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Fix some issues

86fbf7c

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Implement tofu. Incorperate feedback

727c5f9

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Fix some lint bits

9f42248

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

More lint

76d761f

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

More lint

a693947

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

kfox1111 marked this pull request as ready for review May 16, 2024 22:42

kfox1111 added 2 commits May 16, 2024 16:57

Merge branch 'main' into http

28dc797

Merge branch 'main' into http

6416f01

azdagron self-assigned this Jun 4, 2024

azdagron reviewed Jun 5, 2024


kfox1111 and others added 4 commits June 5, 2024 09:47

Apply suggestions from code review

9e8897e

Co-authored-by: Andrew Harding <azdagron@gmail.com> Signed-off-by: kfox1111 <Kevin.Fox@pnnl.gov>

Incorperate feedback

fdfc837

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Merge branch 'http' of https://github.com/kfox1111/spire into http

f8bc768

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Incorperate feedback

b5a439e

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

azdagron reviewed Jun 5, 2024


kfox1111 added 4 commits June 5, 2024 13:02

Incorperate feedback

0656513

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Fix example

6a94dfd

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Incorperate feedback

4e8133e

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>

Incorperate feedback

b0df1d6

Signed-off-by: Kevin Fox <Kevin.Fox@pnnl.gov>
