Leak of durable objects through old virtual and heap references after upgrade #9338

Open
mhofman opened this issue May 8, 2024 · 0 comments
Labels
enhancement (New feature or request), liveslots (requires vat-upgrade to deploy changes)

What is the Problem Being Solved?

A durable object can be reachable by being directly exported, or by a local reference from a durable object, virtual-only object or heap object.

During an upgrade, all virtual-only and heap references are severed. While we track export vs heap vs "virtual/durable" (the 3 "pillars"), we do not currently differentiate virtual-only from durable in the latter. We also do not have a way to enumerate the objects whose only remaining pillar (virtual-only or heap) was severed by the upgrade. As a result, these "previous incarnation only" references cause garbage to accumulate when upgrades occur.

The problem described here is related to information maintained by the vat, and unknown to the kernel. It's different from issues like #7212 which describe cases where the kernel realizes some exported references are no longer reachable after an upgrade.

Even before the removal of stopVat (#6650 closed by #7244) we've been unable to collect some of these cases because we lacked the information necessary to find these "incarnation only references".

Description of the Design

Track virtual-only

Currently each virtual/durable object maintains a vom.rc.${baseRef} entry in the vatStore which contains the total ref count from other vrefs. This forms the "virtual pillar". We need to extend this to differentiate the virtual-only from durable case. We also need to understand if the virtual-only count is from the current or a previous incarnation.

We can do this in a backwards compatible manner by transforming the value of the field vom.rc. into the following: ${totalCount},${virtualOnlyCount},${virtualOnlyIncarnationNum}. When touching the field, the VOM can detect that the virtual-only count comes from an old incarnation, and zero it after subtracting it from the total count. For a virtual-only object, the totalCount and the virtualOnlyCount would always be the same.
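The encoding above could be handled along these lines. This is only a sketch: the helper names (parseRefCount, formatRefCount, normalizeRefCount) are hypothetical and not part of liveslots, and it assumes a legacy value is a bare number that can be read as a total count with no known virtual-only share.

```javascript
// Hypothetical helpers; names are illustrative, not the liveslots API.

// Parse a stored refcount value. A legacy value is assumed to be a single
// number, read here as a total count with a zero virtual-only count.
function parseRefCount(raw) {
  const [total, virtualOnly = '0', incarnation = '0'] = raw.split(',');
  return {
    totalCount: Number(total),
    virtualOnlyCount: Number(virtualOnly),
    virtualOnlyIncarnationNum: Number(incarnation),
  };
}

// Serialize back to `${totalCount},${virtualOnlyCount},${virtualOnlyIncarnationNum}`.
function formatRefCount(record) {
  const { totalCount, virtualOnlyCount, virtualOnlyIncarnationNum } = record;
  return `${totalCount},${virtualOnlyCount},${virtualOnlyIncarnationNum}`;
}

// When the field is touched, discard virtual-only counts recorded by an
// older incarnation: those references were severed by the upgrade.
function normalizeRefCount(record, currentIncarnationNum) {
  const stale =
    record.virtualOnlyCount > 0 &&
    record.virtualOnlyIncarnationNum < currentIncarnationNum;
  if (!stale) {
    return record;
  }
  return {
    totalCount: record.totalCount - record.virtualOnlyCount,
    virtualOnlyCount: 0,
    virtualOnlyIncarnationNum: currentIncarnationNum,
  };
}
```

For example, a stored value of 5,2,1 read during incarnation 3 would normalize to 3,0,3, while 5,2,3 would be left unchanged.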

Track the set of objects with no "upgradable reference"

Tracking virtual-only and heap references does not help identify which virtual/durable objects are no longer referenced after an upgrade. For that we could introduce a durable set recording the vrefs which have neither a durable nor an export pillar; let's call it the "incarnation only references set". Any time an object drops either its export pillar (simplified: get(vom.es.${baseRef}) !== 'r') or its durable pillar (totalCount - virtualOnlyCount === 0), we would check the other pillar, and if both are gone, the vref of the object would be added to the incarnationOnlyRefSet. Exporting the object or referencing it from a durable object would remove the object from the set.

When the totalCount drops to zero, the object can be added to the possiblyDeadSet, as only the heap pillar could possibly remain. Effectively, being in the incarnationOnlyRefSet becomes a precursor state to being in the possiblyDeadSet, and isVirtualObjectReachable could be updated to check this single "incarnation only set".
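A minimal model of this bookkeeping might look like the following. Everything here is an assumption made for illustration: makeTracker and its methods are hypothetical, plain Map and Set objects stand in for the vatStore and for the durable set, and membership is re-evaluated on every change rather than only when a pillar drops.

```javascript
// Hypothetical in-memory model; not the liveslots implementation.
function makeTracker() {
  const exportStatus = new Map(); // baseRef -> 'r' while reachable via export
  const counts = new Map(); // baseRef -> { totalCount, virtualOnlyCount }
  const incarnationOnlyRefSet = new Set(); // would be durable in practice
  const possiblyDeadSet = new Set(); // heap only

  const getCounts = baseRef =>
    counts.get(baseRef) || { totalCount: 0, virtualOnlyCount: 0 };
  const hasExportPillar = baseRef => exportStatus.get(baseRef) === 'r';
  const hasDurablePillar = baseRef => {
    const { totalCount, virtualOnlyCount } = getCounts(baseRef);
    return totalCount - virtualOnlyCount > 0;
  };

  // Re-evaluate set membership after any pillar change.
  const recheck = baseRef => {
    if (hasExportPillar(baseRef) || hasDurablePillar(baseRef)) {
      incarnationOnlyRefSet.delete(baseRef);
      return;
    }
    incarnationOnlyRefSet.add(baseRef);
    if (getCounts(baseRef).totalCount === 0) {
      // Only the heap pillar could remain.
      possiblyDeadSet.add(baseRef);
    }
  };

  return {
    setExported(baseRef, exported) {
      if (exported) {
        exportStatus.set(baseRef, 'r');
      } else {
        exportStatus.delete(baseRef);
      }
      recheck(baseRef);
    },
    // delta is +1 or -1; fromVirtualOnly marks a virtual-only referrer.
    adjustCount(baseRef, delta, fromVirtualOnly) {
      const { totalCount, virtualOnlyCount } = getCounts(baseRef);
      counts.set(baseRef, {
        totalCount: totalCount + delta,
        virtualOnlyCount: virtualOnlyCount + (fromVirtualOnly ? delta : 0),
      });
      recheck(baseRef);
    },
    incarnationOnlyRefSet,
    possiblyDeadSet,
  };
}
```

In this model, an object that loses its last durable reference while a virtual-only reference remains lands in incarnationOnlyRefSet but not yet in possiblyDeadSet; it only reaches possiblyDeadSet once the virtual-only reference is dropped as well.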

To support slow deletion (#8417), this "incarnation only set" could in fact be a map which records whether there are remaining virtual-only references (totalCount !== 0), and possibly in which incarnation these virtual-only references existed. The slow deletion logic could then add the vrefs to possiblyDeadSet in stages. It would also avoid a leak through the possiblyDeadSet, which is heap only.
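One way the staged hand-off could work is sketched below. The function name stageStaleRefs, the entry shape { hasVirtualOnlyRefs, incarnationNum }, and the budget parameter are all hypothetical. The sketch also assumes that entries stay in the durable map until collection actually completes, so that losing the heap-only possiblyDeadSet does not lose track of them.

```javascript
// Hypothetical staging step; not part of liveslots.
// incarnationOnlyRefs: Map of vref -> { hasVirtualOnlyRefs, incarnationNum }
// Moves at most `budget` entries recorded by earlier incarnations into
// possiblyDeadSet and returns how many were staged by this call.
function stageStaleRefs(
  incarnationOnlyRefs,
  possiblyDeadSet,
  currentIncarnationNum,
  budget,
) {
  let staged = 0;
  for (const [vref, { incarnationNum }] of incarnationOnlyRefs) {
    if (staged >= budget) {
      break;
    }
    // Skip entries from the current incarnation and ones already staged.
    if (incarnationNum >= currentIncarnationNum || possiblyDeadSet.has(vref)) {
      continue;
    }
    possiblyDeadSet.add(vref);
    staged += 1;
  }
  return staged;
}
```

Calling this repeatedly with a small budget would spread the work over several deliveries instead of adding every stale vref at once.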

Tracking virtual-only definitions

This part is a little less fleshed out, but durably tracking virtual/durable instances is not sufficient to clean up all garbage. The definitions of virtual-only kinds create a vom.vkind entry each time. Once all objects of that vkind have been deleted from the vatStore, we should be safe to delete this entry as well. I'm not sure how to track this, or if it's worth the trouble.

Security Considerations

None I can think of right now

Scaling Considerations

Adding a large number of objects to this "incarnation only set" still causes a lot of syscalls (at least while vatStore operations require syscalls). For the case of collection deletion, the "Rate-Limited Collection Deletion" section of #8417 describes a list of collections which need to be slow deleted, and we could potentially merge it with the "incarnation only set" (e.g. when encountering a non-empty collection in the set, remove N entries and place them in the set).

Test Plan

TBD

Upgrade Considerations

TLDR: liveslots must leave enough information behind so that successors can clean up incarnation-specific data.

One thing this approach, as described, does not address is how to collect virtual-only objects (or heap objects) which were exported (see #6696). However, we could maintain a similar durable list of non-durable vrefs which were exported. In the new incarnation, we would slowly iterate over the entries from previous incarnations which have a totalCount === 0, drop the export, and add them to the "incarnation only set" (entries with a non-zero totalCount would be dropped when their virtual references get dropped). #7170 implemented this recognition kernel-side after an upgrade, but this might cause a large number of gc actions at once. A kernel-side mitigation for large gc actions is described in #8930, but having vats be good citizens in the first place would help too (and avoid some of the layering violations).
