This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Fetching event reports can be slow #16619

Open
DMRobertson opened this issue Nov 9, 2023 · 1 comment
Labels
A-Admin-API A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db A-Moderation Tools for moderating HSes: event redaction, media removal, purge admin API, reports from users, ... A-Performance Performance, both client-facing and admin-facing O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.

Comments

@DMRobertson
Contributor

E.g. from Jaeger:

[image: two Jaeger trace screenshots]

My money is on LIMIT ... OFFSET ... being slow. I wonder if we could change this to paginate without OFFSET using the id column? (I assume that received_ts increases as id increases, and vice versa).
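A minimal sketch of what paginating on the id column without OFFSET could look like. It uses an in-memory SQLite table as a stand-in; the simplified `event_reports` schema and the helper names `make_demo_db` and `fetch_reports_page` are assumptions for illustration, not Synapse's actual code. It also relies on the assumption stated above, that `id` and `received_ts` increase together.

```python
import sqlite3


def make_demo_db(n=10):
    """Build a throwaway in-memory table (simplified, hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event_reports ("
        " id INTEGER PRIMARY KEY,"
        " received_ts INTEGER NOT NULL,"
        " reason TEXT)"
    )
    conn.executemany(
        "INSERT INTO event_reports (id, received_ts, reason) VALUES (?, ?, ?)",
        [(i, 1000 + i, f"report {i}") for i in range(1, n + 1)],
    )
    return conn


def fetch_reports_page(conn, limit, before_id=None):
    """Return (rows, next_before_id), newest first.

    Instead of skipping rows with OFFSET, the query seeks straight to
    `id < before_id` via the primary key and reads `limit` rows from there.
    """
    sql = "SELECT id, received_ts, reason FROM event_reports"
    args = []
    if before_id is not None:
        sql += " WHERE id < ?"
        args.append(before_id)
    sql += " ORDER BY id DESC LIMIT ?"
    args.append(limit)
    rows = conn.execute(sql, args).fetchall()
    # A full page suggests there may be more rows; hand back the last id seen.
    next_before_id = rows[-1][0] if len(rows) == limit else None
    return rows, next_before_id
```

The caller passes the returned `next_before_id` back in as `before_id` to get the following page; `None` means the last page was short and there is nothing more to fetch.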

@DMRobertson DMRobertson added A-Performance Performance, both client-facing and admin-facing A-Admin-API A-Moderation Tools for moderating HSes: event redaction, media removal, purge admin API, reports from users, ... T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. labels Nov 9, 2023
@reivilibre
Contributor

The API should probably be redesigned so that you can paginate on (received_ts, id) tuples (assuming there is an index on received_ts); the API should hand back some sort of encoded (received_ts, id) token to the client, and the client should pass that token with its next request.
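To illustrate the token idea, here is a rough sketch under stated assumptions: the token format (URL-safe base64 over a JSON pair), the function names, the index name, and the simplified SQLite schema are all hypothetical, not an existing Synapse API. The tuple comparison is written out in expanded form so it does not depend on row-value support in the database.

```python
import base64
import json
import sqlite3


def encode_token(received_ts, report_id):
    """Pack (received_ts, id) into an opaque, URL-safe string."""
    raw = json.dumps([received_ts, report_id]).encode("ascii")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    """Inverse of encode_token; returns (received_ts, id) as ints."""
    received_ts, report_id = json.loads(base64.urlsafe_b64decode(token))
    return int(received_ts), int(report_id)


def fetch_page(conn, limit, from_token=None):
    """Return (rows, next_token), ordered by (received_ts, id) descending."""
    sql = "SELECT received_ts, id, reason FROM event_reports"
    args = []
    if from_token is not None:
        ts, rid = decode_token(from_token)
        # Expanded form of (received_ts, id) < (ts, rid).
        sql += " WHERE received_ts < ? OR (received_ts = ? AND id < ?)"
        args += [ts, ts, rid]
    sql += " ORDER BY received_ts DESC, id DESC LIMIT ?"
    args.append(limit)
    rows = conn.execute(sql, args).fetchall()
    next_token = encode_token(rows[-1][0], rows[-1][1]) if len(rows) == limit else None
    return rows, next_token


def make_demo_db():
    """Throwaway table with duplicate timestamps (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event_reports ("
        " id INTEGER PRIMARY KEY, received_ts INTEGER NOT NULL, reason TEXT)"
    )
    # Assumed index name; a composite index lets the database seek to the token.
    conn.execute(
        "CREATE INDEX event_reports_received_ts_id_idx"
        " ON event_reports (received_ts, id)"
    )
    conn.executemany(
        "INSERT INTO event_reports (id, received_ts, reason) VALUES (?, ?, ?)",
        [(i, 100 * ((i + 1) // 2), f"report {i}") for i in range(1, 7)],
    )
    return conn
```

Including `id` as a tie-breaker means rows that share a `received_ts` are neither skipped nor repeated across pages, which a timestamp-only cursor could not guarantee.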

As you say, LIMIT ? OFFSET ? is likely to blame here since it has to scan the index anyway.

I find myself somewhat surprised that we have so many reports that it takes 10 seconds to scan the index. Apparently we have 91k of them.

But there is no index on received_ts! A short-term fix would be to add an index on that.

We could also rewrite the query a bit so it doesn't perform a join before the LIMIT ? OFFSET ?: I believe that currently it will join to the events table even for the rows that aren't selected, because a LEFT JOIN might produce multiple rows (in theory, unless it's smart enough to know event_id is unique) and so it needs to check that. (probably similar logic for the room_stats_state...)
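One possible shape for that rewrite, sketched against SQLite with made-up, simplified `event_reports` and `events` tables (the column set, the `sender` column, and the helper names are assumptions for illustration). The page of reports is selected in a subquery first, so the join only touches the rows that will actually be returned.

```python
import sqlite3

# Apply LIMIT/OFFSET to event_reports alone, then join only the selected rows.
PAGE_THEN_JOIN = """
SELECT er.id, er.received_ts, er.event_id, ev.sender
FROM (
    SELECT id, received_ts, event_id
    FROM event_reports
    ORDER BY received_ts DESC, id DESC
    LIMIT ? OFFSET ?
) AS er
LEFT JOIN events AS ev ON ev.event_id = er.event_id
ORDER BY er.received_ts DESC, er.id DESC
"""


def make_demo_db():
    """Throwaway tables (hypothetical schema); report 5 has no matching event."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event_reports ("
        " id INTEGER PRIMARY KEY, received_ts INTEGER NOT NULL, event_id TEXT)"
    )
    conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY, sender TEXT)")
    conn.executemany(
        "INSERT INTO event_reports (id, received_ts, event_id) VALUES (?, ?, ?)",
        [(i, 1000 + i, f"$e{i}") for i in range(1, 6)],
    )
    conn.executemany(
        "INSERT INTO events (event_id, sender) VALUES (?, ?)",
        [(f"$e{i}", f"@user{i}:example.org") for i in range(1, 5)],
    )
    return conn


def fetch_joined_page(conn, limit, offset):
    """Run the page-then-join query and return the joined rows."""
    return conn.execute(PAGE_THEN_JOIN, (limit, offset)).fetchall()
```

This keeps the existing LIMIT/OFFSET interface, so it would not remove the cost of skipping rows, only the cost of joining rows that are then discarded. Whether a given planner already avoids that work is something to confirm with EXPLAIN on the real database.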

@reivilibre reivilibre added S-Minor Blocks non-critical functionality, workarounds exist. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db O-Uncommon Most users are unlikely to come across this or unexpected workflow labels Nov 10, 2023