Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make Cursor.stream() work in pipeline mode #409

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

dlax
Copy link
Contributor

@dlax dlax commented Oct 10, 2022

This adds support for the pipeline for Cursor.stream(), aka single-row mode.

Due to how .stream() is implemented, it somehow interrupts the pipeline as it forces results fetch alongside query send so that's a bit suboptimal. But at least, people using the pipeline mode would not need to exit/re-enter it to use .stream().

There is a problem in the libpq when using single-row mode and pipeline mode together; this has been reported and fixed, but is worked around on our side.

Closes #406

We add a first test similar to test_singlerowmode() from Postgres test
suite (src/test/modules/libpq_pipeline/libpq_pipeline.c).

The second one, XFAIL, as it demonstrates a bug in Postgres.
When sending the query from stream() in pipeline mode, we now use the
new pipeline_send() generator to also account for any command previously
in the pipeline. Results fetch is done through _stream_fetchone_gen()
method of Pipeline, which also possibly accounts for results of prior
commands. The single-row query is identified in the results queue by an
extra bool item in the tuple of PendingResult.

As can be seen in test_cursor_stream_query_fetch_bug(), there is a
problem when the pipeline contains a normal query after a streamed one,
apparently due to a bug in the libpq.

Cursor.stream() somehow interrupts the pipeline as it forces results
fetch for the single-row query, and thus for any query in the pipeline
queue (emitted before .stream()).
We enqueue an extra Sync() message after sending the streamed query;
this will reset the single-row mode flag, which is not done presently on
libpq side.
@dlax dlax marked this pull request as ready for review October 12, 2022 09:47
@dvarrazzo
Copy link
Member

Hi @dlax,

are we sure that the work done here is valuable? It seems to me that streaming and batching go in two different directions, and there are other features that cannot be supported in pipeline mode (e.g. COPY).

Is there a concrete gain in adding this extra complexity and branching in the codebase?

@dlax
Copy link
Contributor Author

dlax commented Oct 12, 2022

Hi @dvarrazzo,

That's a good question and I understand the concern about possible extra maintenance burden.

I do remember that we stated that pipeline and streaming modes were not supposed to be used together in the initial implementation. But then I saw that there was code handling this and documentation in the libpq ,so I thought it could be worth.

Currently, it's a limitation we impose on our side (or an oversight), in contrast with COPY.

In fact, I'm not sure these two modes go in "different directions"; namely the pipeline mode helps when network latency matters, and single-row mode would help when, e.g., memory consumption on the client matters. I would suppose someone might be interested in sending a batch of queries, but receiving the results of some of them row-by-row.

Now, as mentioned above, the implementation proposed here does not decouple query sending from results fetch in .stream(), so this use case cannot be realized as is (unless the streaming query is the last one in the pipeline). But that's a limitation I'd like to remove in the future (if we decide to go further with this idea).

I also remember you raised the "silly" idea that we could make the connection always in pipelining, and I was thinking about the blockers for this. Maybe that's theoretical, maybe not?

@dlax
Copy link
Contributor Author

dlax commented Jan 24, 2023

It's probably worth doing some refactoring before proceeding with this feature so as to limit branching in the code base.

Marking as draft accordingly now; will come back later to this.

@dlax dlax marked this pull request as draft January 24, 2023 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

single-row and pipeline modes
2 participants