Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Vault managed intermediate PKI bug #15525

Merged
merged 5 commits into from
Nov 28, 2022
Merged

Conversation

kisunji
Copy link
Contributor

@kisunji kisunji commented Nov 21, 2022

Description

If you are 1) using Vault as a Consul Connect CA, 2) using Vault-managed policies (i.e. you are bringing external existing Vault mounts to use as CA instead of giving Consul privileges to make mounts itself), and 3) have an empty intermediate PKI mount, Consul will fail to startup the CA Manager.

People have gotten around this issue by adding a dummy cert in the intermediate PKI mount but this PR aims to fix the root cause.

Testing & Reproduction steps

  • Added test that fails without the code changes

Links

Vault as CA reference: https://developer.hashicorp.com/consul/docs/connect/ca/vault#vault-acl-policies

PR Checklist

  • updated test coverage
  • external facing docs updated
  • not a security concern

Sorry, something went wrong.

@github-actions github-actions bot added the theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies label Nov 21, 2022
@kisunji kisunji added the theme/consul-vault Relating to Consul & Vault interactions label Nov 21, 2022
Comment on lines +394 to +396
// If this is the first time calling setupIntermediatePKIPath, the backend
// will not have been initialized. Since the mount is ready we can suppress
// this error.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This else if block is the bugfix

Comment on lines 403 to 424
if v.isConsulMountedIntermediate {
// This codepath requires the Vault policy:
//
// path "/sys/mounts/<intermediate_pki_path>/tune" {
// capabilities = [ "update" ]
// }
//
err := v.tuneMountNamespaced(v.config.IntermediatePKINamespace, v.config.IntermediatePKIPath, &mountConfig)
if err != nil {
v.logger.Warn("Intermediate PKI path was mounted by Consul but could not be tuned",
"namespace", v.config.IntermediatePKINamespace,
"path", v.config.IntermediatePKIPath,
"error", err,
)
}

} else {
v.logger.Info("Found existing Intermediate PKI path mount",
"namespace", v.config.IntermediatePKINamespace,
"path", v.config.IntermediatePKIPath,
)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since tuning the mount is only relevant for Consul-managed, I added a conditional to skip since otherwise you'd see an ominous WARN in the logs on startup.

This isn't the cleanest way to determine if the provider is "Consul-managed" or "Vault-managed" but introducing explicit configuration requires some more consideration around default/missing values.

Copy link
Contributor

@jkirschner-hashicorp jkirschner-hashicorp Nov 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If MaxLeaseTTL is set by the operator, tuning this mount really only matters if the mount's MaxLeaseTTL is lower than the IntermediateCertTTL. Otherwise, tuning the config has no effect, as the intermediate CA cert will still be issued with the requested TTL.

I think I heard, though haven't tested, that if you ask Vault to generate a cert (i.e., intermediate CA cert) with a TTL greater than MaxLeaseTTL, the operation will still succeed, the TTL will just be truncated to MaxLeaseTTL. So everything will still work, it's just that the intermediate CA cert will expire sooner than configured. (Unless there's something about our "attempt to renew at 50% of lifetime" math that breaks because it depends on IntermediateCertTTL rather than the TTL of the actual issuer cert.)

I don't know what the default behavior of MaxLeaseTTL is if not configured by the Vault operator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its somewhat short: https://developer.hashicorp.com/vault/docs/concepts/tokens#the-general-case -- 32 days.

You'll definitely want to read the cert's actual validity period rather than assuming the TTL is OK, not sure if that's useful in this PR or a follow-up though.

Also, I think in 1.11+, we've started adding a warning in this event, but obviously that doesn't help Vault <= 1.10...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed that we do work based on the cert's actual validity period:

if lessThanHalfTimePassed(c.timeNow(), intermediateCert.NotBefore, intermediateCert.NotAfter) {

}
}

// Create the role for issuing leaf certs if it doesn't exist yet
// Create the role for issuing leaf certs
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating comment to match code, since it no longer checks for existence

Copy link
Contributor

@cipherboy cipherboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another ACK from the Vault side of things, looks nice @kisunji!

@kisunji kisunji force-pushed the kisunji/vault-managed-inter-pki branch from 3cb86e6 to dfca9ca Compare November 22, 2022 14:54
// Required to determine if we should tune the mount
// if the VaultProvider is ever reconfigured.
v.isConsulMountedIntermediate = true

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On an earlier meeting did we decide that this needed a fallthrough like the root-generate switch had?

Copy link
Contributor Author

@kisunji kisunji Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we talked I thought I should be initializing the mount from within setupIntermediatePKIPath but realized it wasn't necessary to do that here. So all we need to do is prevent error-ing out on ErrBackendNotInitialized; other codepaths like generateIntermediateCSR below will initialize the mount as necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we do end up mounting the intermediate PKI, there's no other action to be done (unlike GenerateRoot which is actually responsible for mounting+initializing+generating).

I thought about making the code paths for root and intermediate more parallel to each other but deferred it to future Chris

@kisunji kisunji marked this pull request as ready for review November 22, 2022 20:56
}
// Required to determine if we should tune the mount
// if the VaultProvider is ever reconfigured.
v.isConsulMountedIntermediate = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this would only get set to true the FIRST time a consul instance ran through this code. After the mount existed it wouldn't set this if it matters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I intended it to be a "best guess" field that only sets if the VaultProvider EVER creates the intermediate mount. If the mount ever existed then we'll never attempt to tune the mount but output a log to help operators diagnose it if they're ever confused by the behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies theme/consul-vault Relating to Consul & Vault interactions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants