Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Late/optional decoding? #1112

Open
ale-dd opened this issue Jan 2, 2024 · 3 comments
Open

Late/optional decoding? #1112

ale-dd opened this issue Jan 2, 2024 · 3 comments

Comments

@ale-dd
Copy link

ale-dd commented Jan 2, 2024

I have a few interrelated questions. Some might even be blasphemous for lack of in-depth knowledge of the wire protocol on my part.

  1. Is there a way to delay decoding results until actually looking at data and interpreting it would become necessary?
  2. Similarly, would it be possible to take undecoded "records" (not in the strict field-to-value map sense of asyncpg records) and pass them as-is to an upsert without having to re-encode them, so long as the bound arguments correspond to those returned by the previous query?
  3. Would that potentially avoid the need to issue type introspection queries?
  4. If not as of today, would this become easier once @vmarkovtsev's asyncpg-rkt fork gets merged back upstream?
  5. Otherwise, at the very least, would it be possible to decode timestamps to something closer to their internal postgres representation - for example mapping them to the Python's decimal type? I personally find that python's datetime is terrible and Postgres timestamps can fall outside of the range that can be represented via datetime. Should such a decimal codec come with asyncpg, or should everyone write their own if and only if they think they need it?

Thank you & happy new year!

@elprans
Copy link
Member

elprans commented Jan 2, 2024

Is there a way to delay decoding results until actually looking at data and interpreting it would become necessary?

Lazy decoding is something that can theoretically be implemented, and we even considered it at some point, but ultimately this doesn't seem to be very useful given that query output is usually consumed in its entirety anyway. Do you have a concrete use case in mind?

pass them as-is to an upsert without having to re-encode them, so long as the bound arguments correspond to those returned by the previous query?

It seems like you would be better served by using a temporary table or a CTE instead.

Would that potentially avoid the need to issue type introspection queries?

You need introspection queries to be able to decode data (lazily or not, does not matter).

If not as of today, would this become easier once @vmarkovtsev's asyncpg-rkt fork gets merged back upstream?

I'm not very familiar with what asyncpg-rkt is doing. Possibly?

Otherwise, at the very least, would it be possible to decode timestamps to something closer to their internal postgres representation - for example mapping them to the Python's decimal type?

Absolutely, use set_type_codec and the tuple format, e.g:

    def decoder(ts_tuple):
        microseconds_since_pg_epoch = ts_tuple[0]
        ...

    def encoder(microseconds_since_pg_epoch):
        return (microseconds_since_pg_epoch,)

    await con.set_type_codec(
        'timestamptz',
        schema='pg_catalog',
        decoder=decoder,
        encoder=encoder,
        format='tuple',
    )

@ale-dd
Copy link
Author

ale-dd commented Jan 3, 2024

It seems like you would be better served by using a temporary table or a CTE instead.

that would only work on the very same database server... unfortunately that is not my use case. I'm shoveling data from a source DB to a destination DB, oftentimes without foreign data wrappers support...

I'm not very familiar with what asyncpg-rkt is doing. Possibly?

it's described here:

https://betterprogramming.pub/i-forked-asyncpg-and-it-parses-database-records-to-numpy-20x-faster-e71024a84bff

the author expressed his interest in merging it back to upstream asyncpg here:

#17 (comment)

@ale-dd
Copy link
Author

ale-dd commented Jan 31, 2024

@elprans would there be any appetite for that fork to be merged back?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants