Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eager open/close of a LazyFile for error handling can make fifo reading fail #2645

Open
sirosen opened this issue Dec 4, 2023 · 0 comments

Comments

@sirosen
Copy link
Contributor

sirosen commented Dec 4, 2023

Overview

When a click.File('rb', lazy=True) is used on Linux named pipes (fifos), the results are not guaranteed to be a simple "lazy read".
LazyFile opens and closes the file eagerly to check for errors (I presume permissions and existence, primarily). If too long passes between the open/close and the subsequent open for reading, the fifo contents may be emptied. If the period between the two open calls is short, something I don't quite understand happens and the read will succeed.

Reproducer

Here's a reproducer which waits 1 second before reading:

import time
import click

@click.command
@click.argument("file", type=click.File("rb", lazy=True))
def main(file):
    time.sleep(1)
    print(file)
    print(file.read())

if __name__ == "__main__":
    main()

Test this against a fifo on linux created and written with mkfifo foo; echo 'hi' > foo.
The echo+redirect will block, and will fail when the script is run with a broken pipe.
The click command will hang on file.read(), waiting for data to enter the fifo.

However, without the time.sleep(1) call, this example succeeds, leading to inconsistent behavior which can be hard to troubleshoot.
I thought maybe os has special behavior around non-GCed file descriptors, so I tried it with GC disabled, but it still failed. So I don't understand how it works when it works.

Expected behavior

Using a lazy file to read a fifo should work just as well as using a non-lazy file, and should not be subject to this timing-based behavior.

Suggested Fixes

There is no perfect solution here which retains the current behavior except when it would break.
However, given that the use of fifos is much less common than ordinary files, I see two fixes which I think would be acceptable, plus an alternative approach:

  1. Explicitly check if it's a fifo before doing the open/close (stat.S_ISFIFO(os.stat(filename).st_mode))
  2. Allow callers to control LazyFile and opt-out of the eager open/close
  3. Ensure it's easy to subclass LazyFile and customize, but otherwise do nothing

I tried out (1) and it seems to work. For (2), I don't see a great interface for this, other than maybe allowing parameters to LazyFile or a customization like click.File(lazy=CustomLazyFileClass).

I put (3) on the table because I appear to be the first person hitting this issue. I see no past issues and no SO questions or similar discussions. If this seems like it's "just too niche to worry about", I'm okay with that. In that case, I just want to be able to easily plug in my customizations.
Right now, I have a workaround which has to directly import from click._compat. It's not very long, but it's not as short as I would like.

long workaround
import typing as t
import click
from click._compat import open_stream

...

class _CustomLazyFile(click.utils.LazyFile):
    def __init__(
        self,
        filename: t.Union[str, "os.PathLike[str]"],
        mode: str = "r",
        encoding: t.Optional[str] = None,
        errors: t.Optional[str] = "strict",
        atomic: bool = False, 
    ):
        self.name: str = os.fspath(filename)
        self.mode = mode
        self.encoding = encoding
        self.errors = errors
        self.atomic = atomic
        self._f: t.Optional[t.IO[t.Any]]
        self.should_close: bool

        if self.name == "-":
            self._f, self.should_close = open_stream(filename, mode, encoding, errors)
        else:
            if "r" in mode and not stat.S_ISFIFO(os.stat(filename).st_mode):
                # Open and close the file in case we're opening it for
                # reading so that we can catch at least some errors in
                # some cases early.
                open(filename, mode).close()
            self._f = None
            self.should_close = True

class LazyBinaryReadFile(click.File):
    def convert(
        self,
        value: str,
        param: click.Parameter | None,
        ctx: click.Context | None,
    ) -> t.IO[bytes]:
        lf = _CustomLazyFile(value, mode="rb")
        if ctx is not None:
            ctx.call_on_close(lf.close_intelligently)
        return t.cast(t.IO[bytes], lf)

So a simpler path to customize that open(filename, mode).close() call, such that I don't need to replicate __init__, would be nifty.

I would be happy to open a PR with an improvement for this, and would most favor (1) as it makes things "just work" with the least fuss for most cases.

Environment

Environment:

  • Python version: 3.11
  • Click version: 8.1.7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant