HashiCorp Vault Signed SSH broken since update to 20.0.0 #11842

Closed
3 of 6 tasks
al-lac opened this issue Mar 3, 2022 · 4 comments

Comments

@al-lac

al-lac commented Mar 3, 2022

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that AWX is open source software provided for free and that I might not receive a timely response.

Summary

We use a signed SSH key to access some of our hosts. For this we use a credential of the type HashiCorp Vault Signed SSH. Since upgrading from 19.4.0 to 20.0.0, we can no longer access hosts that need a signed SSH key.

Before we would get this in the Job Logs:

Identity added: /runner/artifacts/6919/ssh_key_data (/runner/artifacts/6919/ssh_key_data)
Certificate added: /runner/artifacts/6919/ssh_key_data-cert.pub (vault-token-AWX-<ID>)

Now we only get the following:

Identity added: /runner/artifacts/6928/ssh_key_data (/runner/artifacts/6928/ssh_key_data)
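
The difference between the two logs is the missing "Certificate added:" line. A minimal sketch of a check for that line, assuming the job output is available as plain text; the function name certificate_was_added is hypothetical and not part of AWX:

```python
import re

# Matches the "Certificate added: <path> (<comment>)" line quoted above.
CERT_ADDED = re.compile(
    r"^Certificate added: (\S+-cert\.pub) \((.*)\)$", re.MULTILINE
)


def certificate_was_added(job_output: str) -> bool:
    """Return True if the job output reports that an SSH certificate was loaded."""
    return CERT_ADDED.search(job_output) is not None


# The two log excerpts from this report.
before = (
    "Identity added: /runner/artifacts/6919/ssh_key_data (/runner/artifacts/6919/ssh_key_data)\n"
    "Certificate added: /runner/artifacts/6919/ssh_key_data-cert.pub (vault-token-AWX-<ID>)\n"
)
after = (
    "Identity added: /runner/artifacts/6928/ssh_key_data (/runner/artifacts/6928/ssh_key_data)\n"
)

print(certificate_was_added(before))  # True
print(certificate_was_added(after))   # False
```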

AWX version

20.0.0

Select the relevant components

  • UI
  • API
  • Docs

Installation method

kubernetes

Modifications

no

Ansible version

No response

Operating system

No response

Web browser

No response

Steps to reproduce

  • Start a Job Template with a Credential of the Category HashiCorp Vault Signed SSH
  • Wait for the connection to the host

Expected results

Connection to the host should be established

Actual results

Error when trying to connect to a host:

Failed to connect to the host via ssh: user@10.10.0.1: Permission denied (publickey,gssapi-keyex,gssapi-with-mic)."

Additional information

No response

@nilsding

nilsding commented Mar 4, 2022

I looked into it a bit, and to me it looks like the shutil.rmtree(artifact_dir) introduced in #11472 inside awx/main/tasks/receptor.py is the culprit here. The cert file is still created inside the artifact dir, but it's deleted immediately afterwards. Removing that rmtree call makes the jobs pass again.
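
A simplified illustration of that ordering problem, not the actual AWX code path: the directory layout and file contents below are placeholders, and only the shutil.rmtree call mirrors what is described above.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Hypothetical stand-in for the per-job artifact directory.
    artifact_dir = Path(root) / "artifacts" / "6928"
    artifact_dir.mkdir(parents=True)

    # The signed certificate is written into the artifact directory...
    cert_path = artifact_dir / "ssh_key_data-cert.pub"
    cert_path.write_text("placeholder certificate\n")
    existed_before = cert_path.exists()

    # ...and the cleanup removes the whole directory before the job can use it.
    shutil.rmtree(artifact_dir)
    exists_after = cert_path.exists()

print(existed_before, exists_after)  # True False
```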

@AlanCoding
Member

Yes, I am already aware of problems introduced by that shutil.rmtree, as @nilsding pointed out, and I suspect it is the issue here.

That messes things up for node_type='hybrid' nodes. In AWX, all nodes should be node_type='control'. Anyway, we need a condition to not run that block of code if it's running locally. I had that change in some PR but it got mixed in with other work and tied up.
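
A rough sketch of such a condition, under stated assumptions: the function cleanup_artifacts and the flag runs_locally are hypothetical names for illustration, and how AWX would actually detect a local run is not covered here.

```python
import shutil
import tempfile
from pathlib import Path


def cleanup_artifacts(artifact_dir: Path, runs_locally: bool) -> bool:
    """Remove artifact_dir unless the job runs locally. Return True if removed."""
    if runs_locally:
        # A local job still reads from this directory (e.g. the signed
        # certificate), so leave it in place.
        return False
    shutil.rmtree(artifact_dir, ignore_errors=True)
    return True


with tempfile.TemporaryDirectory() as root:
    local_dir = Path(root) / "local_job"
    remote_dir = Path(root) / "remote_job"
    local_dir.mkdir()
    remote_dir.mkdir()

    removed_local = cleanup_artifacts(local_dir, runs_locally=True)
    removed_remote = cleanup_artifacts(remote_dir, runs_locally=False)
    local_still_exists = local_dir.exists()
    remote_still_exists = remote_dir.exists()

print(removed_local, local_still_exists)    # False True
print(removed_remote, remote_still_exists)  # True False
```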

@al-lac
Author

al-lac commented Mar 7, 2022

@AlanCoding good to hear that this is already known. I hope we can get a fix into one of the next releases.

@AlanCoding
Member

@al-lac the linked PR may fix this. If you want to test it, that could speed up getting it merged.

@john-westcott-iv I'm considering how we can add test coverage for certs. We discussed testing for the existence of keys with ssh-add -l, and I'm wondering whether its output might contain certificate information, which would allow for some easy regression coverage of this issue.
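
One way such a check could look, as a sketch only. It assumes that each line of ssh-add -l ends with the key type in parentheses and that certificate entries use a type ending in "-CERT" (for example "RSA-CERT"); that should be verified against the OpenSSH version in the execution environment. The helper names and the sample listing are illustrative, not captured output.

```python
import re
import subprocess
from typing import List

# Assumption: certificate entries end with a type such as "(RSA-CERT)".
CERT_ENTRY = re.compile(r"\(([A-Z0-9-]+-CERT)\)\s*$")


def certificate_entries(listing: str) -> List[str]:
    """Return the lines of an `ssh-add -l` listing that look like certificates."""
    return [line for line in listing.splitlines() if CERT_ENTRY.search(line)]


def agent_listing() -> str:
    """Return the output of `ssh-add -l` for the current agent (not called here)."""
    result = subprocess.run(
        ["ssh-add", "-l"], capture_output=True, text=True, check=False
    )
    return result.stdout


# Illustrative listing with placeholder fingerprints.
sample = (
    "2048 SHA256:<fingerprint> /runner/artifacts/6919/ssh_key_data (RSA)\n"
    "2048 SHA256:<fingerprint> /runner/artifacts/6919/ssh_key_data (RSA-CERT)\n"
)
key_only = "2048 SHA256:<fingerprint> /runner/artifacts/6928/ssh_key_data (RSA)\n"

print(len(certificate_entries(sample)))    # 1
print(len(certificate_entries(key_only)))  # 0
```

A regression test could then assert that certificate_entries(agent_listing()) is non-empty for jobs that use a HashiCorp Vault Signed SSH credential.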

@AlanCoding AlanCoding self-assigned this Mar 16, 2022
@thenets thenets closed this as completed Apr 18, 2022
5 participants