Pagination #673

Open
FoseFx opened this issue Feb 23, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

FoseFx commented Feb 23, 2024

We have kicked this can down the road for long enough now: we need pagination.

References:

Approaches

As always, there are multiple ways to approach this problem.

The straightforward way is...

Limit & Offset

Pages are just slices of a certain size of a query response in the database. SQL gives us the LIMIT and OFFSET keywords for exactly that.

This way, we simply add LIMIT ?N OFFSET (?N * ?P) to every query, where ?N is the page size and ?P the 0-indexed page we want to display.

With dynamic data like ours, however, this approach makes for a bad user experience: rows inserted or deleted between two requests shift the offsets, so a row can be returned twice or, worse, not at all.

There is also a possible (in my opinion negligible) performance issue: to know which rows to skip, the database still has to read them, at least to some degree.

This is, however, easy to implement, easy to extend, and the approach best understood by everyone reading the resulting code.
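The duplicated and skipped rows described above can be reproduced in a few lines. This is a minimal sketch, assuming an in-memory SQLite database and a hypothetical `items` table; the names `make_db` and `fetch_page` are made up for illustration and are not part of our codebase.

```python
import sqlite3


def make_db():
    """Create a throwaway database with ids 1..6 (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 7)])
    return conn


def fetch_page(conn, page, page_size):
    """Return the ids on the given 0-indexed page, highest id first."""
    rows = conn.execute(
        "SELECT id FROM items ORDER BY id DESC LIMIT ? OFFSET ?",
        (page_size, page_size * page),
    ).fetchall()
    return [row[0] for row in rows]


# Scenario A: a row is inserted between two requests.
conn = make_db()
first_page = fetch_page(conn, page=0, page_size=3)   # [6, 5, 4]
conn.execute("INSERT INTO items (id) VALUES (7)")
duplicated = fetch_page(conn, page=1, page_size=3)   # [4, 3, 2]: 4 shows up again

# Scenario B: a row is deleted between two requests.
conn = make_db()
fetch_page(conn, page=0, page_size=3)                # [6, 5, 4]
conn.execute("DELETE FROM items WHERE id = 6")
skipped = fetch_page(conn, page=1, page_size=3)      # [2, 1]: 3 is never shown
```

In scenario A the user sees id 4 on both pages; in scenario B id 3 never appears on any page they request.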

Keyset Pagination

Keyset Pagination exploits the fact that, in order (hah!) to order results, we specify a key to order by. We then limit the results using a WHERE clause on said key.

SELECT ...
  FROM ...
 WHERE ...
   AND id < ?last_seen_id
 ORDER BY id DESC
 LIMIT ?N

This way, the key of the last seen element (in this case its id) becomes a "page token" that can be used to query the next page.
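A minimal sketch of this id-based variant, again assuming an in-memory SQLite database and a hypothetical `items` table. The helper name `fetch_after` and the convention that `None` means "first page" or "no more pages" are assumptions made for the example.

```python
import sqlite3


def fetch_after(conn, last_seen_id, page_size):
    """Return (ids, next_token) for the page following last_seen_id.

    last_seen_id=None requests the first page; next_token=None means the
    page came back short, so there is nothing left to fetch.
    """
    if last_seen_id is None:
        rows = conn.execute(
            "SELECT id FROM items ORDER BY id DESC LIMIT ?", (page_size,)
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT id FROM items WHERE id < ? ORDER BY id DESC LIMIT ?",
            (last_seen_id, page_size),
        ).fetchall()
    ids = [row[0] for row in rows]
    next_token = ids[-1] if len(ids) == page_size else None
    return ids, next_token


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 7)])

page1, token1 = fetch_after(conn, None, 3)    # ids 6, 5, 4; token is 4

# A concurrent insert no longer shifts the next page.
conn.execute("INSERT INTO items (id) VALUES (7)")
page2, token2 = fetch_after(conn, token1, 3)  # ids 3, 2, 1
```

Unlike with OFFSET, the row inserted between the two requests does not cause id 4 to be returned a second time.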

This requires the ORDER BY key to define a total order. A way out, I guess, is to enforce a policy in which we (additionally) order by id whenever a total order cannot be guaranteed without it. As we use UUIDv4s, this might be computationally expensive (maybe not, idk).

SELECT ...
  FROM ...
 WHERE ...
   AND (
         age > (SELECT age FROM table WHERE id = ?last_seen_id) -- we want to mainly order by age
      OR (    age = (SELECT age FROM table WHERE id = ?last_seen_id)
          AND id < ?last_seen_id) -- but people may have the same age
       )
 ORDER BY age ASC, id DESC -- this is a total order, as ids are unique
 LIMIT ?N
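To check that a two-column key really visits every row exactly once, here is a minimal sketch against SQLite. It assumes a hypothetical `people` table and passes the pair (last_age, last_id) in directly instead of looking the age up by subquery; the predicate used is `age > last_age OR (age = last_age AND id < last_id)`, which matches the `ORDER BY age ASC, id DESC` ordering. All names are made up for illustration.

```python
import sqlite3

FIRST_PAGE_QUERY = "SELECT id, age FROM people ORDER BY age ASC, id DESC LIMIT :n"

# Rows strictly "after" the cursor in (age ASC, id DESC) order.
NEXT_PAGE_QUERY = """
SELECT id, age FROM people
 WHERE age > :last_age
    OR (age = :last_age AND id < :last_id)
 ORDER BY age ASC, id DESC
 LIMIT :n
"""


def fetch_people(conn, cursor, n):
    """Return (rows, next_cursor); cursor is None or an (age, id) pair."""
    if cursor is None:
        rows = conn.execute(FIRST_PAGE_QUERY, {"n": n}).fetchall()
    else:
        last_age, last_id = cursor
        rows = conn.execute(
            NEXT_PAGE_QUERY, {"last_age": last_age, "last_id": last_id, "n": n}
        ).fetchall()
    # A short page means we reached the end.
    next_cursor = (rows[-1][1], rows[-1][0]) if len(rows) == n else None
    return rows, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id TEXT PRIMARY KEY, age INTEGER)")
conn.executemany(
    "INSERT INTO people (id, age) VALUES (?, ?)",
    [("a", 30), ("b", 30), ("c", 25), ("d", 40), ("e", 30)],
)

# Walk all pages of size 2 and collect what we saw.
seen, cursor = [], None
while True:
    rows, cursor = fetch_people(conn, cursor, 2)
    seen.extend(rows)
    if cursor is None:
        break
```

Walking the pages yields the same sequence as a single unpaginated `ORDER BY age ASC, id DESC` query, including the three people who share age 30.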

A small optimization is to remove the inner query by encoding the whole key, in this case the tuple (last_age, last_id), into the page token, for example using base64. This means we need to parse/unparse the token. I'd like to defer this in favor of code cleanliness.
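For a sense of how much parsing/unparsing that would involve, here is a minimal sketch. It assumes the tuple is serialized as JSON before being base64-encoded; the function names and the JSON choice are illustrative assumptions, not a decided format.

```python
import base64
import json


def encode_page_token(last_age, last_id):
    """Pack (last_age, last_id) into an opaque, URL-safe string."""
    raw = json.dumps([last_age, last_id]).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_page_token(token):
    """Inverse of encode_page_token; raises on malformed input."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    last_age, last_id = json.loads(raw)
    return last_age, last_id


token = encode_page_token(30, "3f2b8c1e-0000-4000-8000-000000000000")
decoded = decode_page_token(token)
```

Note that base64 is only an encoding: a client can decode and alter the token, so the server would still have to validate whatever comes back.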


Paginate all list responses?

The Google Cloud API Design Guide recommends that all listable collections should support pagination, no matter the expected size.

Rationale: If an API does not support pagination from the start, supporting it later is troublesome because adding pagination breaks the API's behavior. Clients that are unaware that the API now uses pagination could incorrectly assume that they received a complete result, when in fact they only received the first page.

I am personally on board with that.

Is pagination information a part of the message or metadata?

The Google API Design Guide uses page_size and page_token values in the request message, and next_page_token in the response message.
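A minimal sketch of what those three fields look like in use, with plain dataclasses standing in for the proto messages. The message names, the in-memory `ITEMS` list, and the token format (the last returned id as a string) are assumptions for illustration; an empty `next_page_token` signals the last page.

```python
from dataclasses import dataclass, field


@dataclass
class ListItemsRequest:
    page_size: int = 2
    page_token: str = ""  # empty: start from the beginning


@dataclass
class ListItemsResponse:
    items: list = field(default_factory=list)
    next_page_token: str = ""  # empty: no further pages


ITEMS = [1, 2, 3, 4, 5]  # stand-in for a table, sorted ascending, ids > 0


def list_items(req):
    """Hypothetical handler: return one page plus the token for the next."""
    after = int(req.page_token) if req.page_token else 0
    remaining = [item for item in ITEMS if item > after]
    page = remaining[: req.page_size]
    has_more = len(remaining) > len(page)
    token = str(page[-1]) if has_more else ""
    return ListItemsResponse(items=page, next_page_token=token)


def list_all():
    """Client loop: keep requesting until next_page_token comes back empty."""
    items, token = [], ""
    while True:
        resp = list_items(ListItemsRequest(page_size=2, page_token=token))
        items.extend(resp.items)
        token = resp.next_page_token
        if not token:
            return items


first_only = list_items(ListItemsRequest()).items  # what an unaware client sees
everything = list_all()
```

A client that ignores `next_page_token` ends up with only the first two items, which is exactly the failure mode the rationale above warns about.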

I personally don't like this, as I believe these fields to be metadata. As such, they should be sent using headers / metadata. This way we keep the proto definitions clean. We might need to talk to frontend about what works best for them.

@FoseFx FoseFx added the enhancement New feature or request label Feb 23, 2024