Add last written position as metadata in snapshot #10115

deepthidevaki · 2022-08-18T10:16:10Z

Description

For the backup manager to identify whether a snapshot can be included in a backup, the snapshot must include, as part of its metadata, the position of the follow-up event included in the snapshot. This is the event whose source position = lastProcessedPosition in the snapshot. Since it is not possible to find the actual "lastProcessedPosition" in the snapshot without opening the database, we can approximate it by taking the "lastWrittenPosition" used by the AsyncSnapshotDirector to commit the snapshot. This value might be an overapproximation, but it is safe to use.


deepthidevaki · 2022-08-18T14:41:03Z

Here is what I did to achieve this:

- Added a PersistedSnapshot::getMetadata which returns a SnapshotMetadata.
- SnapshotMetadata is written to a file named ZEEBE.METADATA in the snapshot path.
- SnapshotMetadata is serialized as JSON. This way we can also add new fields later without breaking compatibility.
- The metadata file is inside the snapshot, so it is replicated as a normal "SnapshotChunk". This way the replication protocol remains unchanged => backward compatible. The receiver can recognize the metadata file by its name and build a SnapshotMetadata object by reading the file.
- For backward compatibility, we handle the case where the metadata file is not available by building an incomplete SnapshotMetadata with the known values.
- SnapshotMetadata tentatively consists of the following:
  - processedPosition (same as in the snapshot id)
  - exportedPosition (same as in the snapshot id)
  - lastFollowupEventPosition (provided by AsyncSnapshotDirector)
  - version (we have a version in PersistedSnapshot, but it is not used and not persisted. By persisting it in the metadata we can use it later for backwards compatibility)
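
The write and fallback paths described above can be sketched roughly as follows. This is a minimal Python illustration, not the Zeebe implementation: the helper names `write_metadata` and `read_metadata` are hypothetical, the value 45 for lastFollowupEventPosition is made up, and the fallback assumes the snapshot id is laid out as index-term-processedPosition-exportedPosition.

```python
import json
import tempfile
from pathlib import Path

METADATA_FILE_NAME = "ZEEBE.METADATA"  # file name as given in the list above


def write_metadata(snapshot_dir: Path, metadata: dict) -> None:
    """Hypothetical helper: store the metadata as JSON inside the snapshot directory."""
    (snapshot_dir / METADATA_FILE_NAME).write_text(json.dumps(metadata))


def read_metadata(snapshot_dir: Path) -> dict:
    """Hypothetical helper: read the metadata file, or fall back to the snapshot id."""
    metadata_file = snapshot_dir / METADATA_FILE_NAME
    if metadata_file.exists():
        return json.loads(metadata_file.read_text())
    # Fallback for snapshots taken before the metadata file existed.
    # ASSUMPTION: the id is <index>-<term>-<processedPosition>-<exportedPosition>.
    _index, _term, processed, exported = snapshot_dir.name.split("-")
    return {
        "version": None,  # unknown for old snapshots
        "processedPosition": int(processed),
        "exportedPosition": int(exported),
        "lastFollowupEventPosition": None,  # cannot be derived from the id
    }


with tempfile.TemporaryDirectory() as root:
    # Snapshot that carries a metadata file.
    new_snapshot = Path(root) / "19-1-39-19"
    new_snapshot.mkdir()
    write_metadata(
        new_snapshot,
        {
            "version": 1,
            "processedPosition": 39,
            "exportedPosition": 19,
            "lastFollowupEventPosition": 45,  # made-up value for illustration
        },
    )

    # Snapshot without a metadata file, e.g. taken by an older version.
    old_snapshot = Path(root) / "12-1-20-10"
    old_snapshot.mkdir()

    from_file = read_metadata(new_snapshot)
    from_id = read_metadata(old_snapshot)
```

In the fallback case the result is deliberately incomplete: only the values that can be recovered from the snapshot id are filled in.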

With the metadata file, the snapshot directory content will be as follows:

> tree data/raft-partition/partitions/1/snapshots/
data/raft-partition/partitions/1/snapshots/
├── 19-1-39-19
│   ├── 000008.log
│   ├── 000009.sst
│   ├── CURRENT
│   ├── MANIFEST-000005
│   ├── OPTIONS-000007
│   └── ZEEBE.METADATA
└── 19-1-39-19.checksum

npepinpe · 2022-08-18T15:20:30Z

Looks sound. My only concern is the JSON serialization. It has nice properties (human readable, editable, etc.), but I'm not sure it solves schema evolution in any way. Sure, you can add fields, but you could do that with MsgPack, Protobuf, YAML, or even SBE.

As we aren't reading from it often, I think the benefits of human readability trump performance, so I don't think SBE would be particularly beneficial here. But if we really want to tackle schema evolution, I don't think JSON alone is enough for that. If backwards/forwards compatibility is important, then we need to explicitly have some form of update test to guarantee it. Just my two cents.

deepthidevaki · 2022-08-18T15:50:21Z

Human readability was definitely the main motivation for choosing JSON. Also, I think handling backward/forward compatibility is easy as long as we are only adding new fields, by setting ignore unknown properties = true. I think we can also remove fields, because it is possible to deserialize a partial JSON object. Agreed that we need tests to ensure compatibility.
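
To make the compatibility argument concrete, here is a small Python sketch of a tolerant reader: unknown keys are dropped (the equivalent of ignoring unknown properties) and missing keys fall back to defaults. The dataclass, the `parse_metadata` helper, the default values, and the extra `checksum` key are all assumptions made for illustration, not the actual Zeebe code.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class SnapshotMetadata:
    # Field names mirror the JSON keys; the defaults are assumed placeholders.
    version: int = 1
    processedPosition: int = -1
    exportedPosition: int = -1
    lastFollowupEventPosition: Optional[int] = None


def parse_metadata(raw: str) -> SnapshotMetadata:
    """Hypothetical tolerant parser: ignore unknown keys, default missing ones."""
    known = {f.name for f in fields(SnapshotMetadata)}
    data = json.loads(raw)
    return SnapshotMetadata(**{k: v for k, v in data.items() if k in known})


# Written by a hypothetical newer version that added a "checksum" field.
newer = parse_metadata(
    '{"version": 2, "processedPosition": 39, "exportedPosition": 19,'
    ' "lastFollowupEventPosition": 45, "checksum": 123}'
)

# Written by a hypothetical version that lacks some of the fields.
older = parse_metadata('{"processedPosition": 39, "exportedPosition": 19}')
```

Payloads like these two could serve as fixtures for the compatibility tests mentioned above: one fixture per released format, each asserted to parse into the expected object.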

deepthidevaki added the kind/toil label Aug 18, 2022

deepthidevaki mentioned this issue Aug 18, 2022

Zeebe can backup its data to an external storage without downtime and restore from it #9606

Closed

58 tasks

deepthidevaki self-assigned this Aug 18, 2022

deepthidevaki mentioned this issue Aug 19, 2022

Persist snapshot metadata #10121

Merged

10 tasks

ghost closed this as completed in 976c14d Aug 22, 2022

deepthidevaki added the release/8.1.0-alpha5 label Sep 6, 2022

ChrisKujawa added the version:8.1.0 label Oct 4, 2022

This issue was closed.
