Mount the stdout callback plugin for containerized runs #957

AlanCoding · 2022-01-13T04:16:18Z

This is a continuation on from #763

I really want to keep @shanemcd's simplification which eliminates the aliasing to the minimal callback name. That PR was getting hung up on the fact that CI doesn't seem to build the new image correctly. This highlights a pain point that comes from the fact that ansible-runner used from the CLI employs the callback plugin from a different install of ansible-runner, baked into the image. Here, I'm investigating an option to address this pain point where I volume mount the stdout callback plugin, so we assure that any changes to the plugin will be instantly picked up without rebuilding. This might be overly-optimistic, but I'm hoping that this will be compatible with the old image, so even tests will pass. So far I've found some race conditions locally (probably due to multiprocessing), and depending on what tests give me here, my next step might be to make a per-run copy of the plugin.

So in short, instead of backing out some of our simplifications to get that other PR to merge, here, I'm entertaining the idea of leaning into even more simplifications that might hopefully solve several problems.

AlanCoding · 2022-01-14T13:47:44Z

ansible_runner/config/_base.py

+        self._update_volume_mount_paths(new_args, "{}".format(self.private_data_dir), dst_mount_path="/runner", labels=":Z")
+
+        # Mount the stdout callback plugin from the ansible-runner code base
+        self._update_volume_mount_paths(new_args, "{}".format(get_plugin_dir()), dst_mount_path="/home/runner/.ansible/plugins", labels=":Z")


My biggest open question for other people is - where should the mount be? Options include:

/home/runner/.ansible/plugins ~~(this)~~

/home/runner/.ansible/plugins/callback (changed to this)

/home/runner/.ansible/plugins/callback/awx_display.py (single file mount)

/usr/share/ansible/plugins/callback

/usr/share/ansible/plugins/callback/awx_display.py (single file mount)

/usr/share/ansible/collections/ansible_collections/<collection>/<name>/plugins/callback/awx_display.py

The current option doesn't seem to have any benefits compared to the two options after it. After all, consolidating it into a single file was so that I could do the single file mount. I haven't made this change yet because I wanted to do a bit of a survey of preferences first.

Related to this, but still separate, is the question of where it should go within the code base. In this PR, I nested it in the "callback" folder, but to be able to collection-ify this in the future, I see the wisdom in nesting it in "plugins/callback" folders

I seem to have ruled out the single file mount options for now. Doing this results in permission errors like:

ERROR! Unexpected Exception, this is probably a bug: [Errno 13] Permission denied: '/home/runner/.ansible/plugins/callback/awx_display.py'

Maybe this could be fixed with changes to the Dockerfile, but doing so would partially defeat the point.

With my latest push, the mount location is at /home/runner/.ansible/plugins/callback.

thanks for detailing your thought process. I like /home/runner/.ansible/plugins/callback as the result

This patch contains actions suggested in review comments at ansible#763 plus some more In display_callback/module.py, avoid modifying os.environ Remove parent class AWXDefaultCallbackModule, because it makes less since when there is not a job vs. adhoc class split As a consequence, only 1 parent callback class is needed the class which is the default, in absense of awx_display Move modifications of self.env from the runner module to the config module to be most consistent with the rest of the code base

AlanCoding · 2022-01-19T19:18:23Z

Additional questions I have for the team:

Mentioned in other comment, is the plugins folder as the container mount point okay? Is /home/runner/.ansible preferable, or is /usr/share?
About the host machine location, would this be a good final location: ansible_runner/display_callback/callback/awx_display.py?
Assuming (2), would there be interest in trying to see how running ansible-test goes, recognizing that it wouldn't ever be vendored, just used for CI?

Hook in stdout callback volume mount Share pdd and plugin mounts with the NONE base execution mode Remove PYTHONPATH assertions no longer relevant Attempt to patch up docs Refinement of docs page for the refactored display callback

This changes the location of the mount for the stdout callback prior value was at /home/runner/.ansible/plugins new mount location is at /home/runner/.ansible/plugins/callback This is more targeted and more desirable for this reason. We cannot mount single-files like awx_display.py, due to permissions error, which may be because changes are needed in the Dockerfile or some other reason To carry this out, new utils method callback_mount is introduced to make such changes easier to carry out in the future

eqrx · 2022-01-20T14:56:30Z

Additional questions I have for the team:

Mentioned in other comment, is the plugins folder as the container mount point okay? Is /home/runner/.ansible preferable, or is /usr/share?

I personally would /home/runner/.ansible since I would argue that it is more in line with other tools I have seen but objectively I would not mind either paths.

About the host machine location, would this be a good final location: ansible_runner/display_callback/callback/awx_display.py?

Yes!

Assuming (2), would there be interest in trying to see how running ansible-test goes, recognizing that it wouldn't ever be vendored, just used for CI?

I would be interested, can I help?

AlanCoding · 2022-01-21T21:07:08Z

@eqrx I think I messed up a copy+paste in my prior comment, I had wanted to ask if ansible_runner/display_callback/plugins/callback/awx_display.py would be a preferable location.

The logic I'm coming from there is that, in a Makefile target, I could template out a galaxy.yml file in the display_callback folder and run some ansible-test sanity checks. This isn't to package it as a collection (formally), it's just to get some more checks on our radar. I've been meaning to tinker with that (as I still don't know how close it would be to passing), but I'll give it a shot on Monday and see if has any feedback of value.

AlanCoding · 2022-01-24T20:02:10Z

I started that experiment, investigating ansible-test sanity as potentially a new type of CI. I got two kinds of feedback from it.

ERROR: Output on stderr from ansible-doc is considered an error.

Command "ansible-doc -t callback ansible_runner.display_callback.awx_display" returned exit status 0.
>>> Standard Error
[WARNING]: Failed to assign id for <ast.Attribute object at 0x7faea916a920> on 
/home/alancoding/.ansible/collections/ansible_collections/ansible_runner/displa
y_callback/plugins/callback/awx_display.py, skipping

and

ERROR: Found 3 pylint issue(s) which need to be resolved:
ERROR: plugins/callback/awx_display.py:109:19: ansible-format-automatic-specification: Format string contains automatic field numbering specification
ERROR: plugins/callback/awx_display.py:231:32: ansible-format-automatic-specification: Format string contains automatic field numbering specification
ERROR: plugins/callback/awx_display.py:239:23: ansible-format-automatic-specification: Format string contains automatic field numbering specification

The pylint failure is pretty mundane and not challenging to comply with.

The ast error looks fairly concerning, but after following it, I'm less worried than I thought I would be. It caches both instances we assign Display.verbose to something. The code comment in Ansible suggests this is a fairly normal situation.

In summary, this method didn't discover a whole bunch of important and actionable points, and I'm not as motivated to pursue it anymore.

Shrews

This looks ok to me, though I can't really test how this works under AWX. As far as mount point location, I would lean toward under $HOME. Since this is continuation of work from @shanemcd, I'd like to get his approval before merging.

shanemcd · 2022-01-25T21:19:31Z

ansible_runner/config/_base.py

+        self._update_volume_mount_paths(new_args, "{}".format(self.private_data_dir), dst_mount_path="/runner", labels=":Z")
+
+        # Mount the stdout callback plugin from the ansible-runner code base
+        self._update_volume_mount_paths(new_args, callback_mount()[0], dst_mount_path=callback_mount()[1], labels=":Z")


I think this might break if you install runner as root in the system site packages and then run as a non-root user, because you won't be able to relabel a file you dont own.

Maybe try :O instead of :Z?

yeah, this sounds more right to me, but I haven't been able to figure out how to artificially create the error to validate it fixes it. I got it to these permissions:

bash-4.4$ ls -la /var/lib/awx/venv/awx/lib/python3.8/site-packages/ansible_runner/display_callback/callback/ total 44 drwxr-xr-x. 3 root root 4096 Jan 26 03:18 . drwxr-xr-x. 3 root root 4096 Jan 26 03:18 .. -rw-r--r--. 1 root root 30788 Jan 26 03:18 awx_display.py -rw-r--r--. 1 root root 0 Jan 26 03:18 __init__.py drwxr-xr-x. 2 root root 4096 Jan 26 03:18 __pycache__

Then as the awx user, it still works. Is there any tweak I can do which should make this break?

wait, never mind, I was running the wrong thing. I am able to get an error with this type of setup. Right now it just gives the vague message

ERROR! Invalid callback for stdout specified: awx_display

But I'm still tinkering with the install by root user.

Testing the :0 in one environment, I get

Error: mounting overlay failed "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/ansible_runner/display_callback/callback/": chown /var/lib/awx/.local/share/containers/storage/overlay-containers/78dce4e01f5d25bbfe804246e232815dd809cb5360237b1fddf35bc49bfe2fca/userdata/overlay/568435025/upper: invalid argument

I'm not sure how big of a problem this is going to be. In another environment it works. Reading related issues, people mention that kernel version needs to be relatively recent for a metacopy setting. Depending on what we support, I could see this getting complicated.

Okay, trying on my own machine, I was able to reproduce something by doing only this:

sudo chown -R root:root ansible_runner/display_callback/

Resulting in

$ ansible-runner run demo/ -p test.yml --process-isolation Error: lsetxattr /home/alancoding/repos/ansible-runner/ansible_runner/display_callback/callback: operation not permitted

So maybe I am figuring this out. Here, specifically, with root ownership and -rw-rw-r--. permissions, the display callback is not editable for the current user. Like you say, that shouldn't be a problem. I fear that :O will not work in all the contexts that we need it to... but it wouldn't be atypical for runner to create a new tmpdir to make something work. We could do this with a copy of the display callback. We know it should only be a few files.

Having a replicator, I developed a patch that fixes it.

I did not use the :O, because I'm worried this won't work in all cases. Instead, I copied the files (possible if you only have read access) to a tmpdir, and mounted the tmpdir.

Tomorrow I can work on a test case for this logic.

AlanCoding · 2022-01-28T03:21:55Z

I've put up a patch and a test for @shanemcd's concern about cases where ansible-runner is installed by the root user but used by a non-root user. Practically, I am kind of limited with what I can do with tests, because my actual reproducer was to do sudo chown -R root:root ansible_runner/display_callback/, which we do not want to do in tests themselves. This just gets code coverage to assure this new special case doesn't break.

ansible-zuul

LGTM!

@shanemcd

…ner version (#988) Add compatibility code to the entrypoint so new EEs work with old runner version In #957 we combined 2 callback names (awx_display, minimal) into just 1, awx_display. Fundamentally, this makes it hard for the containers to work with ansible-runner versions before/after, and vice versa. Ping review by @shanemcd In that PR, I paid attention to the ability for current ansible-runner to work with old images. I was failing to fully appreciate the implications of old ansible-runner not working with current images. This PR adds a shim to the entrypoint so that the image with work even if used by old versions of ansible-runner. In the PR 957 I removed the use of the LAUNCHED_BY_RUNNER environment variable. This allows using that as a poor proxy for the version of ansible-runner that is starting the container. @rebeccahhh pointed out to me that our tests for AAP controller 4.1.x were failing due to these problems. I've re-tested those with a custom image built from this branch, quay.io/alancoding/ansible-runner:marty_youve_got_to_come_back_with_me, and this gets them to pass. There's no reasonable way to ever get test coverage, because the whole idea is compatibility with old versions. So I am working on running make test with that custom image tagged as quay.io/ansible/ansible-runner:devel while I have the commit f1b6c3a checked out, which was the last commit before the PR landed. Reviewed-by: Shane McDonald <me@shanemcd.com> Reviewed-by: Rebeccah Hunter <rhunter@redhat.com> Reviewed-by: None <None>

github-actions bot added docs Changes to documentation needs_triage New item that needs to be triaged labels Jan 13, 2022

AlanCoding force-pushed the mount_callback branch from df1bc03 to 11530a8 Compare January 13, 2022 19:01

AlanCoding commented Jan 14, 2022

View reviewed changes

AlanCoding changed the title ~~[WIP] Mount the stdout callback plugin for containerized runs~~ Mount the stdout callback plugin for containerized runs Jan 14, 2022

This was referenced Jan 14, 2022

Implement stdout callback and its logic as a collection #482

Closed

Dev tooling - Updates to ansible-runner propagate through the dev env ansible/awx#11323

Closed

AlanCoding force-pushed the mount_callback branch from c4fd693 to 11292eb Compare January 17, 2022 16:14

shanemcd and others added 10 commits January 18, 2022 16:14

Support custom stdout callbacks

b01fca4

Refactor things down to a single stdout callback module

6b83c6b

Switch test to stdout callback in ansible.builtin

837e83e

Use correct metric for IS_ADHOC determination

f098e00

Accomidate other config classes

49026da

Remove test hack

d7eb76a

Code move for single-file display callback

70be623

Remove noqa comments with config change

ee15e0b

Move display callback to required name

dd8349f

AlanCoding force-pushed the mount_callback branch from 11292eb to 9bd9a4d Compare January 18, 2022 21:14

AlanCoding marked this pull request as ready for review January 19, 2022 15:26

AlanCoding requested a review from a team as a code owner January 19, 2022 15:26

AlanCoding added 2 commits January 19, 2022 14:20

Code and test changes for callback volume mount

55d716c

Hook in stdout callback volume mount Share pdd and plugin mounts with the NONE base execution mode Remove PYTHONPATH assertions no longer relevant Attempt to patch up docs Refinement of docs page for the refactored display callback

AlanCoding force-pushed the mount_callback branch from 9bd9a4d to 851f4d3 Compare January 19, 2022 19:21

eqrx approved these changes Jan 20, 2022

View reviewed changes

AlanCoding mentioned this pull request Jan 20, 2022

Improve event processing timings by eliminating -partial.json event data file writes ansible/awx#11557

Open

AlanCoding mentioned this pull request Jan 21, 2022

Support custom stdout callbacks #763

Closed

Shrews approved these changes Jan 24, 2022

View reviewed changes

Shrews removed the needs_triage New item that needs to be triaged label Jan 24, 2022

shanemcd reviewed Jan 25, 2022

View reviewed changes

AlanCoding added 2 commits January 26, 2022 22:32

Workaround when install is not owned by current user

df44cea

Add test to simulate non-root container run

3c3a461

Shrews added the gate label Feb 1, 2022

ansible-zuul bot approved these changes Feb 1, 2022

View reviewed changes

shanemcd approved these changes Feb 1, 2022

View reviewed changes

ansible-zuul bot merged commit 8813003 into ansible:devel Feb 1, 2022

AlanCoding mentioned this pull request Feb 4, 2022

Add compatibility code to the entrypoint so new EEs work with old runner version #988

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mount the stdout callback plugin for containerized runs #957

Mount the stdout callback plugin for containerized runs #957

AlanCoding commented Jan 13, 2022

AlanCoding Jan 14, 2022 •

edited

AlanCoding Jan 17, 2022

eqrx Jan 20, 2022

AlanCoding commented Jan 19, 2022

eqrx commented Jan 20, 2022

AlanCoding commented Jan 21, 2022

AlanCoding commented Jan 24, 2022

Shrews left a comment

shanemcd Jan 25, 2022

shanemcd Jan 25, 2022

AlanCoding Jan 26, 2022

AlanCoding Jan 26, 2022

AlanCoding Jan 26, 2022

AlanCoding Jan 26, 2022

AlanCoding Jan 27, 2022

AlanCoding commented Jan 28, 2022

ansible-zuul bot left a comment

Mount the stdout callback plugin for containerized runs #957

Mount the stdout callback plugin for containerized runs #957

Conversation

AlanCoding commented Jan 13, 2022

AlanCoding Jan 14, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlanCoding commented Jan 19, 2022

eqrx commented Jan 20, 2022

AlanCoding commented Jan 21, 2022

AlanCoding commented Jan 24, 2022

Shrews left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlanCoding commented Jan 28, 2022

ansible-zuul bot left a comment

Choose a reason for hiding this comment

AlanCoding Jan 14, 2022 •

edited