Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove any pipe files before zip/unzipping #1133

Merged
merged 4 commits into from Oct 28, 2022

Conversation

fosterseth
Copy link
Member

@fosterseth fosterseth commented Sep 16, 2022

We are seeing cases where unstream_dir() is hanging when attempting to read ssh_key_data. This file is supposed to be cleaned up by this point, but maybe not if things go awry.

This change just looks for files that are type "pipe" and removes them.

To test I changed the "rm" command here to "ls", so the pipe is never removed.

ssh_key_cleanup_command = 'rm -f {}'.format(ssh_key_path)

https://github.com/ansible/ansible-runner/blob/devel/ansible_runner/config/_base.py#L612

Project updates can still succeed, no hangs.

@fosterseth fosterseth requested a review from a team as a code owner September 16, 2022 21:29
Copy link
Member

@nitzmahone nitzmahone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, other than deleting any socket file encountered in the send tree seems a little draconian (rather than just skipping it).

if stat.S_ISFIFO(os.stat(full_path).st_mode):
# must remove any pipes before reading
# i.e. ssh_key_data that was never cleaned up
os.remove(full_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we need to actually delete the fifo instead of just skipping it for the zip? Neither the stream nor unstream directories are guaranteed to be exclusively "ours", so who knows what might be laying around in there that something else is relying on...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed so that it skips instead of removes. We still remove in the unstream_dir though.

if stat.S_ISFIFO(os.stat(out_path).st_mode):
# must remove any pipes before reading
# i.e. ssh_key_data that was never cleaned up
os.remove(out_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one makes more sense to delete as a safety, esp since it will only trigger when the archive contains a file with the same name/path- seems a lot less likely to stomp an arbitrary socket file laying around that might be in use...

@fosterseth fosterseth force-pushed the fix_ssh_key_data_pipe_reading branch 2 times, most recently from 4f4fa04 to d21629d Compare September 18, 2022 21:19
@@ -22,6 +22,11 @@ def stream_dir(source_directory, stream):
relpath = ""
for fname in files + dirs:
full_path = os.path.join(dirpath, fname)
if stat.S_ISFIFO(os.stat(full_path).st_mode):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a test for this change?

@AlanCoding
Copy link
Member

I am a little worried about performance of running python stat on every file in a large repo, and it might be good to characterize the time cost of doing that compared to the total cost of extracting the zip data.

@fosterseth
Copy link
Member Author

to-do: move logic below the symlink checks, otherwise we get a file not found error

@AlanCoding
Copy link
Member

https://gist.github.com/AlanCoding/90708be38abab46d594ee18a729e86f2

Initial data:

Took 0.021636962890625 to crawl all of .
Took 0.059778451919555664 to crawl all of . while checking for symlinks
That includes 10133 objects
Took 0.07620024681091309 to crawl all of . with stat
Took 2.5405492782592773 to actually stream to in-memory buffer

Idea here is that this PR adds the "stat" computation. The symlink check (in addition to being necessary for this work) is pre-existing. The added time from the stat comes to 0.01642

Compared to the total streaming time, that's like 0.01642/2.5405492782592773, coming to about 0.6% of total time. This will be higher for repos with more small files, and smaller for repos with fewer large files.

@AlanCoding
Copy link
Member

fosterseth#1

Here are 2 proposed tests for the cases being fixed here. Obviously, I confirmed that they fail in devel, those are the failures I give in that link.

I have not confirmed that your work here makes them pass, but obviously, if the fix works, it should.

- Prevents hangs when zipping, unzipping artifacts dir. If python
attempts to open a fifo pipe, it will block indefinitely until it can
read data from the pipe.
Add tests that fail in devel on timeout due to pipe hang
@fosterseth fosterseth requested review from Akasurde and nitzmahone and removed request for Akasurde and nitzmahone October 28, 2022 16:43
@AlanCoding
Copy link
Member

Looks good, we did integration testing with AWX as well and that came back looking fine.

@nitzmahone nitzmahone merged commit 8f39752 into ansible:devel Oct 28, 2022
nitzmahone pushed a commit to nitzmahone/ansible-runner that referenced this pull request Oct 28, 2022
* Write test to demonstrate job hang on pipes

* Remove any pipe files before zip/unzipping

- Prevents hangs when zipping, unzipping artifacts dir. If python
attempts to open a fifo pipe, it will block indefinitely until it can
read data from the pipe.

* move fifo check below symlink

Co-authored-by: Alan Rominger <arominge@redhat.com>
(cherry picked from commit 8f39752)
Shrews pushed a commit that referenced this pull request Nov 1, 2022
* Write test to demonstrate job hang on pipes

* Remove any pipe files before zip/unzipping

- Prevents hangs when zipping, unzipping artifacts dir. If python
attempts to open a fifo pipe, it will block indefinitely until it can
read data from the pipe.

* move fifo check below symlink

Co-authored-by: Alan Rominger <arominge@redhat.com>
(cherry picked from commit 8f39752)

Co-authored-by: Seth Foster <fosterseth@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants