Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Idea: combine parse, bind and execute in one packet if no prepared statement is used for extended protocol query methods #1017

Open
fritshoogland-yugabyte opened this issue Apr 5, 2023 · 11 comments

Comments

@fritshoogland-yugabyte
Copy link

I investigated the packets that the rust postgres driver sends, and when using the extended protocol, a query executed when using an extended protocol execution method first sends a packet for the parse, and then combines the bind and execution in the next packet.

This is understandable for when using a prepared statement, however when the used query is not a prepared statement, this should be possible to reduce in a singe call? I just looked at the java packets, and JDBC seems to perform this in one packet.

This can mean a significant reduction of overhead for small queries where client response time is mostly composited of network times of flight, and potentially reduces CPU.

I have investigated the query method for as extended protocol execution and extended protocol with prepared statement, and the simple_query method for simple query protocol.

@sfackler
Copy link
Owner

sfackler commented Apr 5, 2023

We need to know how Postgres parsed the statement to construct the bind packet.

If you're concerned about overhead of small queries I would recommend preparing them up front which will be faster than preparing them each time either way.

@fritshoogland-yugabyte
Copy link
Author

Thank you for your response. I had the feeling there was some reason for it.
Can you expand on that a bit?
Why can java do this without requiring PostgreSQL to have parsed it, and be able to send out parse, bind and execute in a single go?

@sfackler
Copy link
Owner

sfackler commented Apr 5, 2023

IIRC the Java client string-interpolates all of its query parameters or something.

@fritshoogland-yugabyte
Copy link
Author

I fully agree: that for small queries, preparing these upfront will be faster than preparing them each time. And the documentation suggests for queries that are reused to create a prepared statement.

However, generally, many statements are small and are used once (along with a lot of statements are reused over and over), and quite often these do not have any placeholders and variables to bind, wouldn't it for these be possible in the first place and and if so an improvement to push out the extended query phases of parse, bind and execute in one go?

Or is that simply not possible in the current form of the postgres driver, because it still needs the parse result for the bind packet?

Of course it's possible using the rust postgres driver to simply use the query_simple impl, but if possible for simple SQL reducing the number of packets is likely to make a huge difference with bigger deployments.

@sfackler
Copy link
Owner

sfackler commented Apr 5, 2023

Without writing a client-side query parser (which I am not interested in doing) we don't have any reliable way to determine if the query has parameters or not.

@fritshoogland-yugabyte
Copy link
Author

Check. That is clear.

@fritshoogland-yugabyte
Copy link
Author

@sfackler is there a likewise reason every extended query protocol request uses a named statement? Using a named statement the minimal number of roundtrips for a query is 3: parse, bind/execute, close. If no named statement for non-prepared statements is used, it can be reduced by one, because it doesn't need close?

@sfackler
Copy link
Owner

I think that would be possible with a more sophisticated implementation of ToStatement, yeah.

@fritshoogland-yugabyte
Copy link
Author

Is there anything you want me to do about that, such as creating a separate issue?

When no prepared statement is used, it makes sense to use an unnamed statement. The reduction of roundtrips from 2 to 3 is likely significant especially in clouds where the client might not be physically close to the database. Obviously the close for database side prepared statements as a result of a statement being prepared client side makes sense, which is the current implementation.

@sfackler
Copy link
Owner

Probably worth making a separate issue, yeah.

One complicating factor is that use of the unnamed statement will break when pipelining queries. When that happens, you can end up with "prepare 1, prepare 2, execute 1, execute 2", so the execute 1 step would end up using query 2's statement. That can probably be worked around with some extra state tracking to prevent overwrites of the unnamed statement but I don't know what it would look like off the top of my head.

@sfackler
Copy link
Owner

Keep in mind as well that statements aren't closed synchronously - instead it just registers the fact that it needs to be closed in the future, and that'll be folded into the next request if there is one pending.

ramnivas added a commit to exograph/rust-postgres that referenced this issue Jun 4, 2024
Introduce a new `query_with_param_types` method that allows to specify Postgres type parameters. This obviated the need to use prepared statementsjust to obtain parameter types for a query. It then combines parse, bind, and execute in a single packet.

Related: sfackler#1017, sfackler#1067
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants