
Git.execute output_stream is not documented to suppress kill_after_timeout, but does #1762

Open
EliahKagan opened this issue Dec 10, 2023 · 5 comments

@EliahKagan (Contributor)

When calling Git.execute, including indirectly through the dynamic methods of a Git object, passing a non-None value of output_stream suppresses the effect of kill_after_timeout. The threading.Timer object is created with the local kill_process function as its callback, but start is never called on it.
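The mechanism can be demonstrated in isolation (a minimal sketch; the callback here is a hypothetical stand-in for GitPython's local kill_process function): a threading.Timer that is constructed but never started simply never fires.

```python
import threading

fired = threading.Event()

def kill_process():  # hypothetical stand-in for GitPython's kill_process callback
    fired.set()

# The watchdog is constructed the same way Git.execute constructs it...
watchdog = threading.Timer(0.1, kill_process)

# ...but if watchdog.start() is never called, the callback never runs,
# even well past the 0.1 s delay.
threading.Event().wait(0.3)
watchdog.cancel()  # cancelling an unstarted timer is harmless

print(fired.is_set())  # → False
```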

GitPython/git/cmd.py

Lines 1050 to 1079 in a58a6be

if output_stream is None:
    if kill_after_timeout is not None:
        watchdog.start()
    stdout_value, stderr_value = proc.communicate()
    if kill_after_timeout is not None:
        watchdog.cancel()
        if kill_check.is_set():
            stderr_value = 'Timeout: the command "%s" did not complete in %d ' "secs." % (
                " ".join(redacted_command),
                kill_after_timeout,
            )
            if not universal_newlines:
                stderr_value = stderr_value.encode(defenc)
    # Strip trailing "\n".
    if stdout_value.endswith(newline) and strip_newline_in_stdout:  # type: ignore
        stdout_value = stdout_value[:-1]
    if stderr_value.endswith(newline):  # type: ignore
        stderr_value = stderr_value[:-1]
    status = proc.returncode
else:
    max_chunk_size = max_chunk_size if max_chunk_size and max_chunk_size > 0 else io.DEFAULT_BUFFER_SIZE
    stream_copy(proc.stdout, output_stream, max_chunk_size)
    stdout_value = proc.stdout.read()
    stderr_value = proc.stderr.read()
    # Strip trailing "\n".
    if stderr_value.endswith(newline):  # type: ignore
        stderr_value = stderr_value[:-1]
    status = proc.wait()
# END stdout handling

This situation appears not entirely intentional, because:

So either kill_after_timeout should be honored even when output_stream is used, or the docstring should be updated to mention that it does not (and the code possibly modified to not create the Timer object in the case that it is unused).

I am not sure which change should be made, so I'm opening this rather than a PR to propose one of them.

This is effectively separate from #1756. (One quality this shares with #1756 is that it is relevant only to the behavior of Git.execute and functions that use it. Functions that use handle_process_output are unaffected; its kill_after_timeout is implemented differently with nonequivalent behavior and separate code. handle_process_output also does not take an output_stream argument.)

@Byron (Member) commented Dec 10, 2023

Thanks for bringing this to my attention - I definitely wasn't aware.

Generally, if the user obtains a running process, then they are responsible for handling it; no auto-killing should happen. Thus I think the current behaviour, also in relation to output_stream, should be documented. The code could also be cleaned up so that the timer isn't created in the first place in that case.

Maybe we can consider emitting a warning if incompatible flags are specified, to make people aware that their kill_after_timeout setting isn't effective.

Doing so might then trigger new issues from users arguing for making this work, which is probably when it could be made to work.
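As a sketch of that idea (the helper name and message are hypothetical, not GitPython's actual API), the ineffective combination could be detected up front with the warnings module:

```python
import warnings

def warn_on_ineffective_timeout(output_stream, kill_after_timeout):
    """Hypothetical helper: warn when kill_after_timeout would be ignored."""
    if output_stream is not None and kill_after_timeout is not None:
        warnings.warn(
            "kill_after_timeout is ignored when output_stream is given",
            UserWarning,
            stacklevel=2,
        )

# The combination described in this issue triggers exactly one warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_on_ineffective_timeout(output_stream=object(), kill_after_timeout=10)

print(len(caught))  # → 1
```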

I am not quite sure what the wisest way of dealing with this is, as either change could be considered breaking, and documenting anything could make a 'breaking fix' impossible if the documentation endorses the status quo.

I definitely lean toward doing that, though, but appreciate your advice and/or opinion on what to do here.

@EliahKagan (Contributor, Author) commented Dec 10, 2023

Generally, if the user obtains a running process, then they are responsible for handling it; no auto-killing should happen.

Although passing as_process=True causes a Git.AutoInterrupt object to be returned without waiting for the subprocess to exit, passing output_stream does not affect this. Both branches of the if-else for the condition output_stream is None wait for the subprocess to complete and set the status variable to its exit status (and this if-else is in a try-block whose finally block calls close on the process's stdout and stderr streams). The code that runs for a non-None value of output_stream is:

GitPython/git/cmd.py

Lines 1075 to 1082 in 4a9922f

max_chunk_size = max_chunk_size if max_chunk_size and max_chunk_size > 0 else io.DEFAULT_BUFFER_SIZE
stream_copy(proc.stdout, output_stream, max_chunk_size)
stdout_value = proc.stdout.read()
stderr_value = proc.stderr.read()
# Strip trailing "\n".
if stderr_value.endswith(newline):  # type: ignore
    stderr_value = stderr_value[:-1]
status = proc.wait()

When output_stream is None, the process is waited on by:

stdout_value, stderr_value = proc.communicate()

When output_stream is not None, the process is waited on by:

status = proc.wait()

Regarding whether that behavior should be considered correct, I think the docstring is intended to indicate specifically that behavior, but I am unsure:

GitPython/git/cmd.py

Lines 864 to 871 in 4a9922f

:param output_stream:
    If set to a file-like object, data produced by the git command will be
    output to the given stream directly.
    This feature only has any effect if `as_process` is False. Processes will
    always be created with a pipe due to issues with subprocess.
    This merely is a workaround as data will be copied from the
    output pipe to the given output stream directly.
    Judging from the implementation, you shouldn't use this parameter!

The reason I think it is intended to specify the current behavior is that, if the output were instead read in chunks and copied, allowing it to be used while the subprocess is still running, then that wouldn't generally be a reason not to use it.

Either way, I think it would be a breaking change to make Git.execute return something altogether different when output_stream is used, since all existing uses of output_stream would be broken by such a change. Whether or not output_stream continues to suppress kill_after_timeout, perhaps it should be made more explicit in the Git.execute docstring that output_stream is blocking.

I think this doesn't necessarily mean kill_after_timeout shouldn't be honored when output_stream is used. But if the deciding reason for whether to honor kill_after_timeout or not is whether the subprocess completes before Git.execute returns, then this would be a reason to make the opposite decision.

The code could also be cleaned up so that the timer isn't created in the first place in that case.

We can actually avoid creating the timer in all cases, because Popen.communicate supports an optional timeout argument. This was introduced in Python 3.3, so it was not available at the time most of this code was written. But since GitPython today supports only Python 3.7 and higher (and no versions of Python 2), it could be used. This would make the code shorter and simpler. It would also eliminate what either is, or looks like, a possible race condition on the direct subprocess's PID between the "main" thread and the timer thread. (This race condition, if it exists, is very minor, and is independent of the indirect-subprocess PID race condition discussed in #1756.)
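A minimal sketch of that simplification, assuming a POSIX system (the command, timeout, and error message are illustrative; GitPython's real code would also need its process-group handling and Windows-specific kill logic):

```python
import subprocess

# Start a process that outlives the timeout (illustrative command).
proc = subprocess.Popen(
    ["sleep", "5"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

try:
    # communicate(timeout=...) replaces the threading.Timer watchdog.
    stdout_value, stderr_value = proc.communicate(timeout=0.5)
except subprocess.TimeoutExpired:
    proc.kill()  # where kill_process's work (process groups, etc.) would go
    stdout_value, stderr_value = proc.communicate()  # reap and drain pipes
    stderr_value = b"Timeout: the command did not complete in 0.5 secs."

status = proc.returncode
print(status)  # nonzero after the kill (e.g. -9 on POSIX from SIGKILL)
```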

However, if this is to be done, then the idea of waiting for bigger design changes to add tests for Git.execute's kill_after_timeout (which I have understood #1756 (comment) to recommend) should possibly be revisited. Although a simplification, I think that, without tests that could fail due to a regression, this would carry a greater risk of breakage than #1761 did.

With that said, if you're comfortable having such a change without an accompanying increase in test coverage, I'm fully willing to open a PR for it without changes to the tests.

@Byron (Member) commented Dec 11, 2023

Whether or not output_stream continues to suppress kill_after_timeout, perhaps it should be made more explicit in the Git.execute docstring that output_stream is blocking.

I think this doesn't necessarily mean kill_after_timeout shouldn't be honored when output_stream is used. But if the deciding reason for whether to honor kill_after_timeout or not is whether the subprocess completes before Git.execute returns, then this would be a reason to make the opposite decision.

I see; thanks so much for taking the time to clarify with code references. Admittedly I don't usually check the code if it's only linked, and embedding it definitely ensures I see it.

In conjunction with the opportunity of using the new timeout parameter supported since Python 3.3, along with realising that the output_stream parameter indeed waits for the copy to be completed, it seems like kill_after_timeout could be applied even if output_stream is set. And if so, then the new built-in timeout capabilities should be used.

With that said, if you're comfortable having such a change without an accompanying increase in test coverage, I'm fully willing to open a PR for it without changes to the tests.

Of course it would be great if the test coverage could be improved when adding kill_after_timeout support for output_stream, to cover that case; maybe it's also easy enough to add coverage for the existing kill behaviour, so that the timer code could be removed with certainty. It's probably less about making it work and more about assuring that the code handles possible changes in exceptions or return values correctly, so flying blind might indeed be problematic.

If you think there is enough coverage to make the change (maybe regardless of some missing coverage), a PR would definitely be appreciated. It seems like the maintainability of the code would be improved with such a change.

PS: Seeing this, it seems that stderr could start blocking if its pipe fills while stdout is read, causing a deadlock, particularly if there is no timeout. Am I missing something?

GitPython/git/cmd.py

Lines 1076 to 1078 in 4a9922f

stream_copy(proc.stdout, output_stream, max_chunk_size)
stdout_value = proc.stdout.read()
stderr_value = proc.stderr.read()

@EliahKagan (Contributor, Author)

Of course it would be great if the test coverage could be improved when adding kill_after_timeout support for output_stream, to cover that case; maybe it's also easy enough to add coverage for the existing kill behaviour, so that the timer code could be removed with certainty. It's probably less about making it work and more about assuring that the code handles possible changes in exceptions or return values correctly, so flying blind might indeed be problematic.

If you think there is enough coverage to make the change (maybe regardless of some missing coverage), a PR would definitely be appreciated. It seems like the maintainability of the code would be improved with such a change.

That looks feasible. I'll try to look at it sometime soon. There are a few other GitPython-related things I've ended up being in the middle of at the same time (as each other), so I plan to finish (or discover the unworkability of and abandon) a couple of those first, before circling back to make a PR for this. (But in saying this, it is not my intention to claim the issue. If, in the meantime, someone else comes along and fixes this, I certainly would not object.)

PS: Seeing this, it seems that stderr could start blocking if its pipe fills while stdout is read, causing a deadlock, particularly if there is no timeout. Am I missing something?


I'm not sure, but it does look like that could be a problem. I suspect it may not be natural to fix together with the above stuff, though; it might be better as its own issue. (I might be able to look into that and open one.) Or perhaps it will turn out to be easy to fix along with the rest of this.
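One way the potential deadlock could be avoided (a sketch, not GitPython's actual code; the command and chunk size are illustrative) is to drain stderr on a background thread while the main thread streams stdout, so neither pipe can fill unattended:

```python
import subprocess
import threading

# Illustrative command that writes to both stdout and stderr.
proc = subprocess.Popen(
    ["sh", "-c", "echo out; echo err >&2"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# Collect stderr on a background thread so its pipe is always drained.
stderr_chunks = []
reader = threading.Thread(target=lambda: stderr_chunks.append(proc.stderr.read()))
reader.start()

# Stream stdout in chunks (stand-in for stream_copy into output_stream).
streamed = bytearray()
for chunk in iter(lambda: proc.stdout.read(8192), b""):
    streamed.extend(chunk)

reader.join()
status = proc.wait()
print(bytes(streamed), stderr_chunks[0], status)
```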

@Byron (Member) commented Dec 12, 2023

I'm not sure, but it does look like that could be a problem. I suspect it may not be natural to fix together with the above stuff, though; it might be better as its own issue. (I might be able to look into that and open one.) Or perhaps it will turn out to be easy to fix along with the rest of this.

I am definitely glad that you are aware of it, as that gives a chance of a fix.

What it tries to do is stream stdout but collect stderr. Maybe the implementation of Popen.communicate() can be adjusted to make this work.

Here is how the Rust standard library does something similar to communicate(), collecting the output of two pipes at once without blocking; maybe it can be helpful in some way.

pub fn read2(p1: AnonPipe, v1: &mut Vec<u8>, p2: AnonPipe, v2: &mut Vec<u8>) -> io::Result<()> {
    // Set both pipes into nonblocking mode as we're gonna be reading from both
    // in the `select` loop below, and we wouldn't want one to block the other!
    let p1 = p1.into_inner();
    let p2 = p2.into_inner();
    p1.set_nonblocking(true)?;
    p2.set_nonblocking(true)?;

    let mut fds: [libc::pollfd; 2] = unsafe { mem::zeroed() };
    fds[0].fd = p1.as_raw_fd();
    fds[0].events = libc::POLLIN;
    fds[1].fd = p2.as_raw_fd();
    fds[1].events = libc::POLLIN;
    loop {
        // wait for either pipe to become readable using `poll`
        cvt_r(|| unsafe { libc::poll(fds.as_mut_ptr(), 2, -1) })?;

        if fds[0].revents != 0 && read(&p1, v1)? {
            p2.set_nonblocking(false)?;
            return p2.read_to_end(v2).map(drop);
        }
        if fds[1].revents != 0 && read(&p2, v2)? {
            p1.set_nonblocking(false)?;
            return p1.read_to_end(v1).map(drop);
        }
    }

    // Read as much as we can from each pipe, ignoring EWOULDBLOCK or
    // EAGAIN. If we hit EOF, then this will happen because the underlying
    // reader will return Ok(0), in which case we'll see `Ok` ourselves. In
    // this case we flip the other fd back into blocking mode and read
    // whatever's leftover on that file descriptor.
    fn read(fd: &FileDesc, dst: &mut Vec<u8>) -> Result<bool, io::Error> {
        match fd.read_to_end(dst) {
            Ok(_) => Ok(true),
            Err(e) => {
                if e.raw_os_error() == Some(libc::EWOULDBLOCK)
                    || e.raw_os_error() == Some(libc::EAGAIN)
                {
                    Ok(false)
                } else {
                    Err(e)
                }
            }
        }
    }
}
