You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
There are, like always, multiple ways to approach this problem.
The straightforward way is...
Limit & Offset
Pages are just slices of a certain size of a query response in the database. SQL gives us the LIMIT and OFFSET keywords for exactly that.
This way, we simply add LIMIT ?N OFFSET (?N * ?P) to every query, where ?N is the page size and ?P the 0-indexed page we want to display.
With dynamic data, like we deal with, however, we cause a bad user-experience with this approach: Insertions and Deletions of Rows, will cause rows to be returned twice, or worse, not at all.
There is a (in my opinion negligible) possible performance issue, as well: In order to know which rows can be skipped, this means those rows are still queried, at least to some degree.
This is, however, easy to implement, to extend, and most-well understood by everyone reading the resulting code.
Keyset Pagination
Keyset Pagination exploits the fact, that in order (hah!) to order results, we specify a key which we order by. We then limit results, using a WHERE clause on said key.
SELECT ...
FROM ...
WHERE ...
AND id < ?last_seen_id
ORDER BY id DESCLIMIT ?N
This way, the in this case id, of the last seen element will become a "page token", that can be used to query the next page.
This requiresORDER BY key to be a total order. I guess a way out is to enforce a policy, in which we (additionally) order by id, if no total order can be guaranteed without it. As we use uuidv4s, this might be computationally expensive (maybe not, idk).
SELECT ...
FROM ...
WHERE ...
AND age >= (SELECT age FROM table WHERE id = ?last_seen_id) -- we want to mainly order by ageAND id < ?last_seen_id -- but people may have the same ageORDER BY age ASC, ORDER BY id DESC-- this is a linear order, as ids are uniqueLIMIT ?N
A small optimization for this is to remove the inner query, by encoding the key, in this case the tuple (last_age, last_id), for example using base64. This means we need to parse/unparse it. I'd like to defer this is in favor of code cleanliness.
Paginate all list responses?
The Google Cloud API Design Guide recommends that all listable collections should support pagination, no matter the expected size.
Rationale: If an API does not support pagination from the start, supporting it later is troublesome because adding pagination breaks the API's behavior. Clients that are unaware that the API now uses pagination could incorrectly assume that they received a complete result, when in fact they only received the first page.
I personally am onboard with that.
Is pagination information a part of the message or metadata?
The Google API Design Guide uses page_size and page_token values in the request message, and next_page_token in the response message.
I personally don't like this, as I believe these fields to be metadata. As such, they should be sent using headers / metadata. This way we keep the proto definitions clean. We might need to talk to frontend about what works best for them.
The text was updated successfully, but these errors were encountered:
We kicked this can down the road for long enough now, but we need pagination.
References:
Approaches
There are, like always, multiple ways to approach this problem.
The straightforward way is...
Limit & Offset
Pages are just slices of a certain size of a query response in the database. SQL gives us the
LIMIT
andOFFSET
keywords for exactly that.This way, we simply add
LIMIT ?N OFFSET (?N * ?P)
to every query, where?N
is the page size and?P
the 0-indexed page we want to display.With dynamic data, like we deal with, however, we cause a bad user-experience with this approach: Insertions and Deletions of Rows, will cause rows to be returned twice, or worse, not at all.
There is a (in my opinion negligible) possible performance issue, as well: In order to know which rows can be skipped, this means those rows are still queried, at least to some degree.
This is, however, easy to implement, to extend, and most-well understood by everyone reading the resulting code.
Keyset Pagination
Keyset Pagination exploits the fact, that in order (hah!) to order results, we specify a key which we
order by
. We then limit results, using a WHERE clause on said key.This way, the in this case
id
, of the last seen element will become a "page token", that can be used to query the next page.This requires
ORDER BY key
to be a total order. I guess a way out is to enforce a policy, in which we (additionally) order by id, if no total order can be guaranteed without it. As we use uuidv4s, this might be computationally expensive (maybe not, idk).A small optimization for this is to remove the inner query, by encoding the key, in this case the tuple
(last_age, last_id)
, for example using base64. This means we need to parse/unparse it. I'd like to defer this is in favor of code cleanliness.Paginate all list responses?
The Google Cloud API Design Guide recommends that all listable collections should support pagination, no matter the expected size.
I personally am onboard with that.
Is pagination information a part of the message or metadata?
The Google API Design Guide uses
page_size
andpage_token
values in the request message, andnext_page_token
in the response message.I personally don't like this, as I believe these fields to be metadata. As such, they should be sent using headers / metadata. This way we keep the proto definitions clean. We might need to talk to frontend about what works best for them.
The text was updated successfully, but these errors were encountered: