
Optimize query conversion in C #456

Open
dvarrazzo opened this issue Dec 13, 2022 · 12 comments

@dvarrazzo
Member

Profiling the current state of Psycopg (3.1.5) in a tight loop, the profile looks like:

[py-spy flame graph of the benchmark run]

The ragged hill is the machinery that converts parameters from Python to Postgres. It is probably the best area to attack in order to improve performance: likely even more valuable, and less intrusive, than #415.

Rewriting this part in Cython would also allow the query and the Transformer to communicate in C, so there would probably be no need to resurface to Python inside a PostgresQuery.convert() call (except to call Python dumpers, of course).

pgq = PostgresQuery(self._tx)
pgq.convert(query, params)
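
Just to sketch the idea (this is not code from the library: the placeholder handling is oversimplified and the Transformer calls are only meant to suggest the get_dumper()/dump() protocol), a C-level entry point could look roughly like:

    # Hypothetical sketch, not the real psycopg PostgresQuery.
    from psycopg.adapt import PyFormat

    cdef class PostgresQueryC:
        cdef object _tx            # the Transformer
        cdef readonly bytes query  # query with placeholders rewritten as $1, $2, ...
        cdef readonly list params  # dumped parameters, ready to pass to the libpq

        def __init__(self, tx):
            self._tx = tx

        cpdef convert(self, bytes query, object vars):
            cdef list out = []
            cdef Py_ssize_t i
            # A real implementation would scan the query in C; splitting
            # on %s keeps the sketch short.
            parts = query.split(b"%s")
            for i in range(1, len(parts)):
                out.append(parts[i - 1])
                out.append(b"$%d" % i)
            out.append(parts[-1])
            self.query = b"".join(out)
            # Resurface to Python only to run the dumpers themselves.
            self.params = [
                None if v is None
                else self._tx.get_dumper(v, PyFormat.AUTO).dump(v)
                for v in (vars or ())
            ]

The point is that convert() stays a thin wrapper and everything below it can remain at the C level, apart from the dump() calls.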

@jmallad

jmallad commented Dec 14, 2022

Hey Daniele, I'm Josh Mallad. Josh Drake from Command Prompt sent me here - he is potentially willing to sponsor me for this work. I'm trying to wrap my head around this codebase, and I have some questions. Is there a relevant psycopg development community you would prefer to direct me to (IRC channel, mailing list, etc.) or should I contact you directly?

@dvarrazzo
Member Author

Hello Josh, nice to meet you 🙂

A large part of the development happens on this tracker. You are welcome to ask questions on this issue if they are relevant to the task, or to open a discussion if it's something more generic.

If you don't feel comfortable asking on a public channel, feel free to write to me directly. You can find my email in the git commits.

Good luck!

@jmallad

jmallad commented Dec 15, 2022

How can I reproduce the results you are seeing with your profiling? What test code are you using? I guess the first thing I want to do is try profiling different types to see which ones are hot spots. My guess would be strings and datetime objects, but I can't know for sure yet.

@dvarrazzo
Member Author

A graph like the one above can be created using py-spy.

I have used the command line:

py-spy record --native -r 500 -o myout.svg -- \
    python tests/scripts/bench-411.py psycopg --ntests 50000 --no-drop --no-create \
    --dsn="host=localhost dbname=psycopg3_test sslmode=disable"

The command runs a script with a minimal tight loop (the name comes from issue #411) - just a micro-benchmark - and saves the result as a flame graph in myout.svg. In the command line above:

  • --native allows profiling both the C and the Python code in the same trace.
  • -r 500 is the sampling rate. A higher rate would probably not add resolution: the profiler would just trip over itself.
  • --no-create and --no-drop refer to the test data created by bench-411.py. The first time you run the script you can't use --no-create; once you have done a run with --no-drop, you can pass both --no-create and --no-drop to leave the test data in the database, so that creation and deletion don't affect the benchmark.
  • In the connection string, I used host=localhost because I wanted to measure waiting on the network rather than on Unix sockets, and sslmode=disable to keep the encryption time out of the benchmark.
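
For reference, the loop being measured is little more than a parametrised insert repeated many times; something along these lines (the table name and values are invented here, and the real script also takes care of setup, timing and several drivers) goes through the same conversion path:

    # Minimal stand-in for the benchmark loop; bench-411.py does more
    # (table setup, several drivers, timing and reporting). The table
    # name and row shape here are invented.
    import psycopg

    dsn = "host=localhost dbname=psycopg3_test sslmode=disable"
    with psycopg.connect(dsn) as conn:
        with conn.cursor() as cur:
            for i in range(50_000):
                # Each execute() goes through the parameter conversion
                # machinery visible in the flame graph.
                cur.execute(
                    "insert into bench_data (id, name) values (%s, %s)",
                    (i, f"row {i}"),
                )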

And yes, of course different data types take more or less time to convert. All the data type loaders and dumpers have a Python implementation (in psycopg/psycopg/types); some of them also have a faster equivalent in C (in psycopg_c/psycopg_c/types). The basic ones (strings, numbers, datetime) are already implemented. Others are added progressively: faster array loaders were added in 3.1.5 (see #359). Some would probably be good to add, such as uuid (see #447); others are probably not worth the hassle (network types, composites: they are not used often enough to be relevant).
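
If it helps to picture what a Python-side dumper is, here is a made-up one (not one of the shipped dumpers; Dumper and register_dumper() are the normal adaptation API):

    # Made-up example of a Python dumper, registered globally.
    from fractions import Fraction

    import psycopg
    from psycopg.adapt import Dumper

    class FractionDumper(Dumper):
        # oid left at the default (unknown): the server will infer the type.
        def dump(self, obj):
            # Return the Postgres text representation as bytes.
            return str(float(obj)).encode()

    # The C dumpers in psycopg_c play the same role, with dump()
    # implemented in Cython/C instead of Python.
    psycopg.adapters.register_dumper(Fraction, FractionDumper)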

@jmallad

jmallad commented Dec 16, 2022

Awesome, thanks so much. I think I've got everything I need to start hacking on this now.

@jmallad

jmallad commented Dec 27, 2022

Getting back into a normal rhythm now that the holidays are over. I'm currently working on converting _queries.py to Cython - I will have something to push for review before the week is over.

@jmallad

jmallad commented Jan 9, 2023

I've got the new _queries.pyx building now. You can see it here. I'm now trying to find spots in the annotated HTML where it can be further optimized. I've just seen issue #446 - is this a potentially worthwhile path to explore for this issue? I notice it works with the type annotations, many of which I had to remove to get Cython to compile.

@dvarrazzo
Member Author

Thank you very much, that's good to hear! I'll take a look.

Using mypyc, as per #446, might get some speedup, but, as far as I understand, mypyc is not designed to interface with C code.

The algorithms used to parse the queries should benefit from having their inner parts written in C, using C structures rather than Python lists and dicts, so I'd keep them in Cython, with an eye to pushing them towards pure C.

The parts of the adapter that show up in profiling after the query manipulation are the check of whether the query should be prepared and the setting of the results on the cursor. They are comparatively less meaningful to optimise, and they are in close contact with Python objects, so there is little that can be turned into pure C, I think, and less to gain from writing handcrafted Cython code.

@jmallad

jmallad commented Jan 16, 2023

Hey, can someone let me know if I'm on the right track here?

My brain is fried from figuring out Cython, but I think I am starting to get it. _query2pg has a lot of list usage, so I thought I'd start by implementing a fast C list structure. This way, I can replace the slower Python lists with faster C structures.

This isn't so straightforward though, particularly when you get to exceptions. My C list calls malloc() a few times, allocating resources which must later be freed, but Python could raise an exception in the middle of the C code, messing everything up.

One idea I have is to wrap the C list inside a Python object, so it can have a destructor (__del__) method. Does this sound like it would work?
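
Concretely, the pattern I'm thinking of would be a cdef class that owns the malloc'd buffer and frees it in __dealloc__, which, as far as I understand, Cython runs whenever the object is collected, even if an exception unwound the code using it. A rough sketch (the names are made up, this isn't from my branch):

    # Rough sketch of a C-backed list wrapped in a cdef class (made-up names).
    from libc.stdlib cimport malloc, realloc, free

    cdef class CPtrList:
        cdef void **items
        cdef Py_ssize_t size, capacity

        def __cinit__(self):
            self.size = 0
            self.capacity = 8
            self.items = <void **>malloc(self.capacity * sizeof(void *))
            if self.items == NULL:
                raise MemoryError()

        cdef int append(self, void *item) except -1:
            cdef void **tmp
            if self.size == self.capacity:
                tmp = <void **>realloc(self.items, 2 * self.capacity * sizeof(void *))
                if tmp == NULL:
                    raise MemoryError()
                self.items = tmp
                self.capacity *= 2
            self.items[self.size] = item
            self.size += 1
            return 0

        def __dealloc__(self):
            # Runs when the object is collected, even after an exception,
            # so the malloc'd buffer is never leaked.
            free(self.items)

If I read the Cython docs right, __dealloc__ (rather than Python's __del__) is the hook meant for releasing C resources in an extension type.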

@dvarrazzo
Member Author

Hello! Thank you for working on it! I will review the code later today.

@dvarrazzo
Member Author

@jmallad I have started taking a look at your code. However, can you make a pull request, so I can share a review? Thank you!

@jmallad

jmallad commented Jan 17, 2023

#485
