Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Bail during start-up if there are old background updates. #16397

Open
wants to merge 8 commits into
base: develop
Choose a base branch
from

Conversation

clokep
Copy link
Contributor

@clokep clokep commented Sep 28, 2023

This is a basic implementation to enforce that some old background updates have run.

My hope is that this could be used to avoid future situations like #16192. This would fix #16047 in a simplistic manner and doesn't attempt to differentiate between different types of background updates.

This still would require us to do thinking about what old updates are now required, but I think it would fix the issue where we have to continually create background jobs due to dependencies.

TODO

  • Enforce that ordering is filled in (non-nullable).
  • Maybe try to fix that BACKGROUND_UPDATES_COMPAT_VERSION < last full schema dump?
  • Documentation.

@clokep
Copy link
Contributor Author

clokep commented Sep 28, 2023

Putting up for review as I'd be interested in feedback on this. Is this approach worth pursuing?

@erikjohnston
Copy link
Member

One thought I had was that we need to think a bit about how this interacts with security releases, i.e. if we put out a high severity fix would this potentially block large servers from upgrading (if they were still churning through a large BG update)?

@DMRobertson
Copy link
Contributor

Similarly: suppose we write a bad background update that takes aaaaaaaaaaages to run. Months later we discover this, after we've bumped the schema version. Do we have the ability to patch the background update itself?

(See e.g. the rebuild user directory background update, for example)

@clokep
Copy link
Contributor Author

clokep commented Sep 29, 2023

I think the response to both those is the same -- you want the minimum background update version to trail ages behind your current release. So it really becomes a process thing.

You could still have an issue where a deploy of Synapse is also very far behind and then needs to jump due to security releases though, I don't have a good answer to that.

Similarly: suppose we write a bad background update that takes aaaaaaaaaaages to run. Months later we discover this, after we've bumped the schema version. Do we have the ability to patch the background update itself?

I think the (annoying) solution you do is a point release before you bump the required version. People can then update to that version with the improved performance.

@clokep
Copy link
Contributor Author

clokep commented Oct 2, 2023

Enforce that ordering is filled in (non-nullable).

ordering is NOT NULL, but has a default of 0. 🤷

Comment on lines +137 to +140
BACKGROUND_UPDATES_COMPAT_VERSION = (
# The replace_stream_ordering_column from 6001 must have run.
61
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking on it more, I wonder if this should not be just background updates, but also refuse to start if your database is older than this? Effectively making a minimum supported version to upgrade from?

Comment on lines +138 to +139
# The replace_stream_ordering_column from 6001 must have run.
61
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs upgrade notes to let folks know to double check this has ran before upgrading?

Comment on lines +50 to +52
On startup, if there exists any background update (via the
`background_updates.ordering` column) older than `BACKGROUND_UPDATES_COMPAT_VERSION`,
Synpase will refuse to start.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoyingly many updates (especially older ones) are inserted without an ordering, which defaults to 0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might need to somehow enforce that new updates at least have a non-zero ordering, if we're taking this path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might need to somehow enforce that new updates at least have a non-zero ordering, if we're taking this path.

I agree -- I'm unsure if that's a process thing or a database change. I guess we can update the column to not have a default so people at least need to provide one?

Comment on lines +54 to +56
This is useful for adding delta files which assume background updates have
finished; overall maintenance of Synapse (by allowing for removal of code
supporting old background updates); among other things.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the realization today that this also lets us delete the handlers for older background updates, which convinces me more this is a reasonable approach.

@clokep
Copy link
Contributor Author

clokep commented Oct 16, 2023

I think this is ready for an initial review for real.

@clokep clokep marked this pull request as ready for review October 16, 2023 20:11
Comment on lines +32 to +36
* The Synapse codebase defines a constant `synapse.storage.schema.SCHEMA_COMPAT_VERSION`
which represents the minimum database versions the current code supports.
Whenever the Synapse code is updated to assume backwards-incompatible changes
made to the database format, `synapse.storage.schema.SCHEMA_COMPAT_VERSION` is also updated
so that administrators can not accidentally roll back to a too-old version of Synapse.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that defining this before db.schema_compat_version is a bit backwards.

# ordering is an int based on when the background update was added:
#
# (schema version when added * 100) + (schema delta when added).
update_schema_version = ordering // 100
Copy link
Member

@richvdh richvdh Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned that the fact that the ordering is (schema version when added * 100) + N is a bit of a loose, unenforced, convention.

In many ways I'd find it less confusing if, instead of using a COMPAT_VERSION following the same system as SCHEMA_VERSION etc, we instead had a MIN_PENDING_BACKGROUND_UPDATE_ORDER which we compared with the raw ordering of the pending updates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is definitely a bit loose, using the raw minimum might make more sense.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Updated Synapse versions should fail to start if required background updates have not run yet.
4 participants