
Consider allowing unbuffered queries for ORM SelectQuery #17517

Open
nicosp opened this issue Jan 2, 2024 · 7 comments


nicosp commented Jan 2, 2024

Description

The main blocker preventing us from moving to 5.x is our use of unbuffered queries. Our reports are created as follows:

  1. Run the "main" report query on the "reports" connection, which always has buffering disabled. This lets us fetch any number of rows because they are not loaded into memory.
  2. For each record we use the main connection which has buffering enabled to fetch any associations and format the entities.
  3. Write the formatted entity in the report.
  4. Move to the next result in the "reports" connection and repeat.

We could not find a replacement in 5.x. Note: we cannot simply use PDO for step 1 because we rely on the ORM for access control and other facilities.
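For concreteness, the workflow above could be sketched like this against the CakePHP 4.x API, where `disableBufferedResults()` still existed on the query (it was removed in 5.0, which is the blocker being reported). Table names, field names, and helpers are hypothetical:

```php
// Sketch of the 4.x-era pattern (disableBufferedResults() was removed in 5.0).
// "Orders" is assumed to be bound to the "reports" connection.
$orders = $this->fetchTable('Orders');

// 1. One unbuffered query: rows stream from the server instead of being
//    held in memory by the driver.
$query = $orders->find()->disableBufferedResults();

foreach ($query->all() as $order) {
    // 2. Buffered queries on the default connection to fetch associations.
    $lines = $this->fetchTable('OrderLines')
        ->find()
        ->where(['order_id' => $order->id])
        ->toArray();

    // 3. Write the formatted entity to the report (hypothetical helper).
    fwrite($report, $this->formatEntity($order, $lines));

    // 4. The foreach advances the unbuffered cursor to the next row.
}
```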

CakePHP Version

5.0

@othercorey othercorey added the ORM label Jan 2, 2024
@othercorey othercorey added this to the 5.1.0 milestone Jan 2, 2024

ADmad commented Jan 9, 2024

You could use Connection::run() to run the query and get a StatementInterface instance which you can iterate over, or just use pagination to fetch reports in chunks.
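One reading of this suggestion, sketched against the CakePHP 5 API ("Articles" and the condition are hypothetical). Note that `run()` returns raw rows via a `StatementInterface`, not hydrated entities:

```php
// Build the query through the ORM, but execute it via the connection to get
// a statement that can be iterated row by row.
$articles = $this->fetchTable('Articles');
$query = $articles->find()->where(['published' => true]);

$statement = $articles->getConnection()->run($query);
while ($row = $statement->fetch('assoc')) {
    // $row is a plain associative array, not an entity, so ORM facilities
    // (behaviors, entity methods, etc.) do not apply here.
}
```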


nicosp commented Jan 9, 2024

@ADmad The problem is that we rely on the CakePHP ORM for several facilities that we would need to re-implement on top of PDO. We cannot use pagination because records created or deleted while the report is running cause gaps or missed records between pages.

othercorey commented

Why can't you buffer the results from the first query?

@nicosp
Copy link
Contributor Author

nicosp commented Jan 11, 2024

> Why can't you buffer the results from the first query?

Memory usage is too high to run the first query with buffering enabled (before we even get to the ORM). We create huge XML files from thousands of records. All workers are limited to 128 MB, which is not enough for our bigger customers. So we have to use an unbuffered ORM query to fetch records one by one consistently, and a "normal" buffered query to "decorate" each record with associations and related data.
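As an aside on the output side of this workflow: PHP's XMLWriter can stream each record straight to disk, so the writer itself stays within a small memory budget and the remaining pressure is the buffered result set, which is the point being made above. A minimal sketch (path and element names are hypothetical):

```php
// Streaming XML generation: each record is flushed to disk as it is written,
// so memory does not grow with the number of records.
$xml = new XMLWriter();
$xml->openUri('/tmp/report.xml');
$xml->startDocument('1.0', 'UTF-8');
$xml->startElement('records');

foreach ($records as $record) {      // $records: the unbuffered result iterator
    $xml->startElement('record');
    $xml->writeElement('id', (string)$record->id);
    $xml->endElement();
    $xml->flush();                   // push buffered output to disk per record
}

$xml->endElement();
$xml->endDocument();
$xml->flush();
```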

othercorey commented

So your worker is batching jobs by pulling N records from the first query, building the rest of the data just for that set, and then releasing the data before the next batch?

Is it not possible to just pull the records you need for each batch?

@nicosp
Copy link
Contributor Author

nicosp commented Jan 26, 2024

@othercorey Let me explain it better:

  1. The batch worker fetches ALL records in a single unbuffered ORM query on the "report" connection. We cannot use any variant of "fetch N records" (paging etc.) because new rows are inserted while this long-running query executes, and it is critical that we do not skip any records as a result. This is not just theoretical; it has happened. We also cannot use a buffered query here because the number of records is very high and our rows are quite large.

  2. For each record, we use buffered ORM queries on the default connection to fetch associations etc.

  3. We write the "full" record data to our XML feed and move to the next record in the "report" connection's query.

Hope this makes it clear.
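To spell out what "unbuffered" means at the driver level, this is the MySQL PDO attribute involved (plain PDO shown purely for illustration; the feature request here is to expose the equivalent through the ORM's SelectQuery). Connection credentials and the table are hypothetical:

```php
// With buffered queries disabled, rows are pulled from the server one at a
// time as the cursor advances, so memory stays flat regardless of result size.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

$stmt = $pdo->query('SELECT * FROM orders');
foreach ($stmt as $row) {
    // No other query may run on THIS connection until the statement is fully
    // consumed or closed -- which is why the workflow above needs a second,
    // buffered connection for the per-record association lookups.
}
```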

othercorey commented

FYI - we are considering this, but don't have an immediate solution.

ADmad mentioned this issue Apr 1, 2024