Pagination #673

FoseFx · 2024-02-23T22:24:32Z

We kicked this can down the road for long enough now, but we need pagination.

References:

Approaches

There are, like always, multiple ways to approach this problem.

The straightforward way is...

Limit & Offset

Pages are just slices of a certain size of a query response in the database. SQL gives us the LIMIT and OFFSET keywords for exactly that.

This way, we simply add LIMIT ?N OFFSET (?N * ?P) to every query, where ?N is the page size and ?P the 0-indexed page we want to display.

With dynamic data, like we deal with, however, we cause a bad user-experience with this approach: Insertions and Deletions of Rows, will cause rows to be returned twice, or worse, not at all.

There is a (in my opinion negligible) possible performance issue, as well: In order to know which rows can be skipped, this means those rows are still queried, at least to some degree.

This is, however, easy to implement, to extend, and most-well understood by everyone reading the resulting code.

Keyset Pagination

Keyset Pagination exploits the fact, that in order (hah!) to order results, we specify a key which we order by. We then limit results, using a WHERE clause on said key.

SELECT ...
  FROM ...
 WHERE ...
   AND id < ?last_seen_id
 ORDER BY id DESC
 LIMIT ?N

This way, the in this case id, of the last seen element will become a "page token", that can be used to query the next page.

This requires ORDER BY key to be a total order. I guess a way out is to enforce a policy, in which we (additionally) order by id, if no total order can be guaranteed without it. As we use uuidv4s, this might be computationally expensive (maybe not, idk).

SELECT ...
  FROM ...
 WHERE ...
   AND age >= (SELECT age FROM table WHERE id = ?last_seen_id) -- we want to mainly order by age
   AND id < ?last_seen_id -- but people may have the same age
 ORDER BY age ASC, ORDER BY id DESC -- this is a linear order, as ids are unique
 LIMIT ?N

A small optimization for this is to remove the inner query, by encoding the key, in this case the tuple (last_age, last_id), for example using base64. This means we need to parse/unparse it. I'd like to defer this is in favor of code cleanliness.

Paginate all list responses?

The Google Cloud API Design Guide recommends that all listable collections should support pagination, no matter the expected size.

Rationale: If an API does not support pagination from the start, supporting it later is troublesome because adding pagination breaks the API's behavior. Clients that are unaware that the API now uses pagination could incorrectly assume that they received a complete result, when in fact they only received the first page.

I personally am onboard with that.

Is pagination information a part of the message or metadata?

The Google API Design Guide uses page_size and page_token values in the request message, and next_page_token in the response message.

I personally don't like this, as I believe these fields to be metadata. As such, they should be sent using headers / metadata. This way we keep the proto definitions clean. We might need to talk to frontend about what works best for them.

The text was updated successfully, but these errors were encountered:

FoseFx added the enhancement New feature or request label Feb 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pagination #673

Pagination #673

FoseFx commented Feb 23, 2024

Pagination #673

Pagination #673

Comments

FoseFx commented Feb 23, 2024

References:

Approaches

Limit & Offset

Keyset Pagination

Paginate all list responses?

Is pagination information a part of the message or metadata?