Add vault.agent.authenticated metric #26570

Open
wants to merge 1 commit into
base: main

markafarrell

This adds a metric to Vault Agent telemetry that shows whether the agent is currently authenticated and holds a valid token.

When the metric is set to 1 it means that the agent has successfully authenticated with the vault server and has a valid token.

When it is set to 0 it means that the agent does not have a valid token.

fixes #26569
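
The 0/1 semantics described above can be sketched in Go. This is only an illustration: `authGauge` and `replay` are hypothetical names, not part of Vault or its metrics library, and the sketch assumes the gauge starts at 0 and is overwritten after every login attempt.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// authGauge is a hypothetical stand-in for a telemetry gauge.
// It holds 1 while the agent has a valid token and 0 otherwise.
type authGauge struct {
	v atomic.Int32
}

// Set overwrites the gauge with the current authentication state.
func (g *authGauge) Set(authenticated bool) {
	if authenticated {
		g.v.Store(1)
		return
	}
	g.v.Store(0)
}

// Value returns the last recorded state.
func (g *authGauge) Value() int32 {
	return g.v.Load()
}

// replay applies a sequence of login outcomes and returns the gauge
// value at startup and after each attempt.
func replay(g *authGauge, outcomes []bool) []int32 {
	g.Set(false) // assumed: unauthenticated at startup
	history := []int32{g.Value()}
	for _, ok := range outcomes {
		g.Set(ok)
		history = append(history, g.Value())
	}
	return history
}

func main() {
	var g authGauge
	// One failed login, one success, then a later failure.
	fmt.Println(replay(&g, []bool{false, true, false})) // prints [0 0 1 0]
}
```

Unlike the existing success and failure counters, which only ever increase, a gauge like this reflects the current state, so a reader of the metrics does not have to compare counter deltas to work out whether the agent is authenticated right now.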

@markafarrell markafarrell requested a review from a team as a code owner April 22, 2024 03:06
@markafarrell
Author

The steps below can be used to demonstrate the new metric.

Generate TLS certificates

mkdir -p container-data/vault/tls
openssl req -x509 -nodes -days 9999 -newkey rsa:2048 \
-keyout  container-data/vault/tls/vault_server.key -out  container-data/vault/tls/vault_server.crt \
-subj "/C=AU/ST=Some-State/L=Some-City/O=Internet Widgits Pty Ltd/OU=Something/\
CN=vault-server" \
-addext "subjectAltName = DNS:vault-server"
chmod g+r container-data/vault/tls/*

Generate configuration

mkdir -p container-data/vault/config

cat <<EOF > container-data/vault/config/vault_main.hcl
ui = true

listener "tcp" {
  address = "[::]:8200"
  cluster_address = "[::]:8201"
  tls_cert_file = "/vault/tls/vault_server.crt"
  tls_key_file  = "/vault/tls/vault_server.key"
}

storage "file" {
  path = "/vault/data"
}
EOF

Start vault

mkdir -p logs

mkdir -p logs/vault

chmod g+w logs/vault

docker network create vault-agent-test

docker run --rm -d -p 8200:8200 -e VAULT_LOG_LEVEL=debug -e NO_PROXY="vault-server" -v $PWD/container-data/vault/tls/:/vault/tls/ -v $PWD/container-data/vault/config/vault_main.hcl:/vault/config/vault_main.hcl -v $PWD/logs/vault:/var/log/vault --cap-add IPC_LOCK --network=vault-agent-test --name=vault-server hashicorp/vault:1.16.1 server

docker run --rm -it --cap-add IPC_LOCK -e VAULT_CLI_NO_COLOR=1 -e VAULT_ADDR=https://vault-server:8200 -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 operator init | tee logs/init.log

Extract unseal keys and root token

mkdir -p secrets
grep "Unseal Key" logs/init.log | awk -F':' '{ print $2}' | tr -d ' ' | tr -d '\r' | tee secrets/unseal_keys
grep "Initial Root Token" logs/init.log | awk -F':' '{ print $2}' | tr -d ' ' | tr -d '\n' | tr -d '\r' | tee secrets/root_token

Unseal Vault

for k in $(cat secrets/unseal_keys)
do
    docker run --rm -it --cap-add IPC_LOCK -e VAULT_ADDR=https://vault-server:8200 -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 operator unseal $k
done

Enable approle auth

docker run --rm -it --cap-add IPC_LOCK -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 auth enable approle

Create approle

docker run --rm -it --cap-add IPC_LOCK -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 \
vault write auth/approle/role/my-role \
    secret_id_ttl=10m \
    token_num_uses=10 \
    token_ttl=20m \
    token_max_ttl=30m \
    secret_id_num_uses=40

mkdir -p secrets/approle

docker run --rm -it --cap-add IPC_LOCK -e VAULT_CLI_NO_COLOR=1 -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 vault read -field=role_id auth/approle/role/my-role/role-id | tr -d '\n' | tr -d '\r' | tee secrets/approle/role-id; echo

docker run --rm -it --cap-add IPC_LOCK -e VAULT_CLI_NO_COLOR=1 -e VAULT_ADDR=https://vault-server:8200 -e VAULT_TOKEN=$(cat $PWD/secrets/root_token) -e NO_PROXY=vault-server -e VAULT_SKIP_VERIFY=TRUE --network=vault-agent-test hashicorp/vault:1.16.1 vault write -field secret_id -f auth/approle/role/my-role/secret-id | tr -d '\n' | tr -d '\r' | tee secrets/approle/secret-id; echo

Generate Vault Agent configuration

mkdir -p container-data/vault-agent/config

cat <<EOF > container-data/vault-agent/config/vault-agent-conf.hcl
auto_auth {
  method {
    type = "approle"

    config = {
      role_id_file_path = "/etc/vault/approle/role-id"
      secret_id_file_path = "/etc/vault/approle/secret-id"
    }
  }

  sinks {
    sink {
      type = "file"

      config = {
        path = "/tmp/file-foo"
      }
    }
  }
}
listener "tcp" {
  address = "0.0.0.0:8100"
  tls_disable = true
  unauthenticated_metrics_access = true
}

telemetry {
  disable_hostname = true
}

cache {}
EOF

Start Vault Agent

docker run --rm -d -p 8100:8100 -e VAULT_LOG_LEVEL=debug -e VAULT_ADDR=https://vault-server:8200 -e VAULT_SKIP_VERIFY=TRUE -e NO_PROXY="vault-server" -v $PWD/container-data/vault/tls/:/vault/tls/ -v $PWD/container-data/vault-agent/config/vault-agent-conf.hcl:/vault/config/vault-agent-conf.hcl -v $PWD/secrets/approle:/etc/vault/approle:rw --cap-add IPC_LOCK --network=vault-agent-test --name=vault-agent hashicorp/vault:1.16.1 agent -config /vault/config/vault-agent-conf.hcl
docker logs vault-agent

Get Vault Agent Metrics

curl -s -x '' 'http://127.0.0.1:8100/agent/v1/metrics?format=prometheus' | grep vault_agent_auth
# HELP vault_agent_auth_success vault_agent_auth_success
# TYPE vault_agent_auth_success counter
vault_agent_auth_success 2

Start Modified Vault Agent

# Stop the stock agent first; both containers use the name vault-agent and port 8100.
docker stop vault-agent
docker run --rm -d -p 8100:8100 -e VAULT_LOG_LEVEL=debug -e VAULT_ADDR=https://vault-server:8200 -e VAULT_SKIP_VERIFY=TRUE -e NO_PROXY="vault-server" -v $PWD/container-data/vault/tls/:/vault/tls/ -v $PWD/container-data/vault-agent/config/vault-agent-conf.hcl:/vault/config/vault-agent-conf.hcl -v $PWD/secrets/approle:/etc/vault/approle:rw --cap-add IPC_LOCK --network=vault-agent-test --name=vault-agent vault:dev agent -config /vault/config/vault-agent-conf.hcl
docker logs vault-agent

Get Vault Agent Metrics

curl -s -x '' 'http://127.0.0.1:8100/agent/v1/metrics?format=prometheus' | grep vault_agent_auth
# HELP vault_agent_auth_authenticated vault_agent_auth_authenticated
# TYPE vault_agent_auth_authenticated gauge
vault_agent_auth_authenticated 1
# HELP vault_agent_auth_failure vault_agent_auth_failure
# TYPE vault_agent_auth_failure counter
vault_agent_auth_failure 4
# HELP vault_agent_auth_success vault_agent_auth_success
# TYPE vault_agent_auth_success counter
vault_agent_auth_success 2

@markafarrell markafarrell force-pushed the feature/add-agent-authenticated-metric branch 2 times, most recently from b88be6e to aafebf7 Compare April 22, 2024 03:12
@divyaac divyaac added the agent label Apr 22, 2024
@@ -0,0 +1,3 @@
```release-note:feature
Contributor

This would probably qualify as an improvement!

Author

fixed

ah.logger.Info("renewed auth token")

case <-credCh:
ah.logger.Info("auth method found new credentials, re-authenticating")
ah.logger.Info("autreh method found new credentials, -authenticating")
Contributor

We should probably revert this change

Author

fixed

@divyaac
Contributor

divyaac commented Apr 22, 2024

Hi @markafarrell, thank you so much for your PR! Before we proceed, I'd love to understand your use case a little better. Currently, we can see when authentication has succeeded and the agent has a valid token in the server logs (see https://github.com/hashicorp/vault/blob/main/command/agentproxyshared/auth/auth.go#L480 ). Is there a reason that telemetry might serve your needs better than the server logs?

@markafarrell
Author

@divyaac Having a metric makes it much easier to integrate with alerting tools like Prometheus Alertmanager.

You can then get an alert when the metric goes to zero and act promptly, instead of having to look at logs to see that the agent is not authenticated.
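
As a rough illustration of the check an alerting rule would perform, the Go sketch below reads one gauge out of Prometheus text-format output and flags a value of 0. In practice this logic would normally live in a Prometheus alert rule rather than in Go. The helper `gaugeValue` is hypothetical, and the metric name `vault_agent_authenticated` is an assumption based on the PR title.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// gaugeValue scans Prometheus text-format output for an unlabeled
// sample named name and returns its value. The second result is false
// if the metric is absent or its value cannot be parsed.
func gaugeValue(body, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip blank lines and "# HELP" / "# TYPE" comments.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 2 || fields[0] != name {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	// Sample scrape body; the metric name is assumed, not confirmed.
	sample := "# TYPE vault_agent_authenticated gauge\n" +
		"vault_agent_authenticated 0\n"

	v, ok := gaugeValue(sample, "vault_agent_authenticated")
	switch {
	case !ok:
		fmt.Println("metric missing")
	case v == 0:
		fmt.Println("ALERT: agent is not authenticated")
	default:
		fmt.Println("agent is authenticated")
	}
}
```

A missing metric is reported separately from a value of 0, since an agent that cannot be scraped at all is a different failure from one that is running without a token.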

@divyaac
Contributor

divyaac commented Apr 23, 2024

Thanks for your response @markafarrell. I think adding this metric would make sense. After addressing the comments we should be able to move this PR along!

@@ -276,6 +288,8 @@ func (ah *AuthHandler) Run(ctx context.Context, am AuthMethod) error {
if err != nil {
ah.logger.Error("error creating client for wrapped call", "error", err, "backoff", backoffCfg)
metrics.IncrCounter([]string{ah.metricsSignifier, "auth", "failure"}, 1)
// Set unauthenticated when authentication fails
Contributor

@divyaac divyaac Apr 23, 2024

This comment can be deleted.

Author

fixed

@@ -478,6 +511,8 @@ func (ah *AuthHandler) Run(ctx context.Context, am AuthMethod) error {
}

metrics.IncrCounter([]string{ah.metricsSignifier, "auth", "success"}, 1)
// Set authenticated when authentication succeeds
Contributor

@divyaac divyaac Apr 23, 2024

This comment can be deleted.

Author

fixed

@@ -144,12 +144,18 @@ func (ah *AuthHandler) Run(ctx context.Context, am AuthMethod) error {
backoffCfg := newAutoAuthBackoff(ah.minBackoff, ah.maxBackoff, ah.exitOnError)

ah.logger.Info("starting auth handler")

// Set unauthenticated when starting up
metrics.SetGauge([]string{ah.metricsSignifier, "auth", "authenticated"}, 0)
Contributor

  1. The comment can be deleted (likewise for the other new additions).
  2. The metric name can be shortened to
    metrics.SetGauge([]string{ah.metricsSignifier, "authenticated"}, 0)
    i.e., we can remove the "auth" element from the string array.
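
To see what the rename changes in the exported name, here is a simplified Go model. It assumes the Prometheus output joins a `vault` prefix and the key parts with underscores, which matches the `vault_agent_auth_authenticated` line shown earlier in this thread; `flatName` is a hypothetical helper, and `"agent"` stands in for the value of `ah.metricsSignifier`.

```go
package main

import (
	"fmt"
	"strings"
)

// flatName models how a metric key might be flattened into a
// Prometheus-style name: prefix and parts joined with underscores.
// This is a simplification, not the metrics library's actual code.
func flatName(prefix string, parts []string) string {
	return strings.Join(append([]string{prefix}, parts...), "_")
}

func main() {
	// Key as first proposed, with the "auth" element.
	fmt.Println(flatName("vault", []string{"agent", "auth", "authenticated"}))
	// prints vault_agent_auth_authenticated

	// Key after the suggested change.
	fmt.Println(flatName("vault", []string{"agent", "authenticated"}))
	// prints vault_agent_authenticated
}
```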

Author

fixed

Contributor

@divyaac divyaac left a comment

Once the changes are addressed we can move forward!

@markafarrell markafarrell force-pushed the feature/add-agent-authenticated-metric branch from e336c63 to 54ebad9 Compare April 29, 2024 01:54
@markafarrell markafarrell requested review from a team as code owners April 29, 2024 01:54
@markafarrell markafarrell force-pushed the feature/add-agent-authenticated-metric branch 3 times, most recently from f15336a to f60907f Compare April 29, 2024 02:07
@markafarrell markafarrell changed the title Add vault.agent.auth.authenticated metric Add vault.agent.authenticated metric Apr 29, 2024
@markafarrell markafarrell force-pushed the feature/add-agent-authenticated-metric branch from f60907f to a6b2e25 Compare April 30, 2024 01:33
Contributor

@schavis schavis left a comment

Doc updates lgtm

@markafarrell markafarrell requested a review from divyaac May 3, 2024 05:05
@markafarrell markafarrell force-pushed the feature/add-agent-authenticated-metric branch from aed34b0 to 3bf2951 Compare May 3, 2024 05:12
@markafarrell markafarrell force-pushed the feature/add-agent-authenticated-metric branch from 3bf2951 to 1eb2322 Compare May 6, 2024 01:21
Successfully merging this pull request may close these issues.

Unable to ascertain vault agent authentication status from metrics
3 participants