fix(swingset): use "dirt" to schedule vat reap/bringOutYourDead #9169
Conversation
Previously, vat state initialization (e.g. setting counters to zero) happened lazily, the first time that `provideVatKeeper()` was called. When creating a new vat, the pattern was:

```js
vk = kernelKeeper.provideVatKeeper();
vk.setSourceAndOptions(source, options);
```

Now, we initialize both counters and source/options explicitly, in `recordVatOptions`, when the static/dynamic vat is first defined:

```js
kernelKeeper.createVatState(vatID, source, options);
```

In the future, this will make it easier for `provideVatKeeper` to rely upon the presence of all vat state keys, especially `options`.

Previously, `vatKeeper.getOptions()` would tolerate a missing key by returning empty options. The one place where this was needed (terminating a barely-created vat because `startVat()` failed, using `getOptions().critical`) was changed, so this tolerance is no longer needed, and was removed. The tolerance caused bug #9157 (kernel doesn't panic when it should), which continues to be present, but at least won't cause a crash.

refs #8980
`dispatch.bringOutYourDead()`, aka "reap", triggers garbage collection inside a vat, and gives it a chance to drop imported c-list vrefs that are no longer referenced by anything inside the vat.

Previously, each vat had a configurable parameter named `reapInterval`, which defaults to a kernel-wide `defaultReapInterval` (but can be set separately for each vat). This defaults to 1, mainly for unit testing, but real applications set it to something like 200. This caused BOYD to happen once every 200 deliveries, plus an extra BOYD just before we save an XS heap-state snapshot.

This commit switches to a "dirt"-based BOYD scheduler, wherein we consider the vat to get more and more dirty as it does work, and eventually it reaches a `reapDirtThreshold` that triggers the BOYD (which resets the dirt counter).

We continue to track `dirt.deliveries` as before, with the same defaults. But we add a new `dirt.gcKrefs` counter, which is incremented by the krefs we submit to the vat in GC deliveries. For example, calling `dispatch.dropImports([kref1, kref2])` would increase `dirt.gcKrefs` by two.

The `reapDirtThreshold.gcKrefs` limit defaults to 20. For normal use patterns, this will trigger a BOYD after ten krefs have been dropped and retired. We choose this value to allow the #8928 slow vat termination process to trigger BOYD frequently enough to keep the BOYD cranks small: since these will be happening constantly (in the "background"), we don't want them to take more than 500ms or so. Given the current size of the large vats that #8928 seeks to terminate, 10 krefs seems like a reasonable limit. And of course we don't want to perform too many BOYDs, so `gcKrefs: 20` is about the smallest threshold we'd want to use.

External APIs continue to accept `reapInterval`, and now also accept `reapGCKrefs`.
* kernel config record
  * takes `config.defaultReapInterval` and `defaultReapGCKrefs`
  * takes `vat.NAME.creationOptions.reapInterval` and `.reapGCKrefs`
* `controller.changeKernelOptions()` still takes `defaultReapInterval` but now also accepts `defaultReapGCKrefs`

The APIs available to userspace code (through `vatAdminSvc`) are unchanged (partially due to upgrade/backwards-compatibility limitations), and continue to only support setting `reapInterval`. Internally, this just modifies `reapDirtThreshold.deliveries`:

* `E(vatAdminSvc).createVat(bcap, { reapInterval })`
* `E(adminNode).upgrade(bcap, { reapInterval })`
* `E(adminNode).changeOptions({ reapInterval })`

Internally, the kernel-wide state records `defaultReapDirtThreshold` instead of `defaultReapInterval`, and each vat records `.reapDirtThreshold` in its `vNN.options` key instead of `vNN.reapInterval`. The current dirt level is recorded in `vNN.reapDirt`.

The kernel will automatically upgrade both the kernel-wide and the per-vat state upon the first reboot with the new kernel code. The old `reapCountdown` value is used to initialize the vat's `reapDirt.deliveries` counter, so the upgrade shouldn't disrupt the existing schedule. Vats which used `reapInterval = 'never'` (eg comms) will get a `reapDirtThreshold` of all 'never' values, so they continue to inhibit BOYD. Otherwise, all vats get a `threshold.gcKrefs` of 20. We do not track dirt when the corresponding threshold is 'never', to avoid incrementing the comms dirt counters forever.

This design leaves room for adding `.computrons` to the dirt record, as well as tracking a separate `snapshotDirt` counter (to trigger XS heap snapshots, ala #6786). We add `reapDirtThreshold.computrons`, but do not yet expose an API to set it.

Future work includes:

* upgrade vat-vat-admin to let userspace set `reapDirtThreshold`

New tests were added to exercise the upgrade process, and other tests were updated to match the new internal initialization pattern.
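To make the configuration surface concrete, here is a sketch of how these knobs could appear in a kernel config record. Only the reap-related field names come from the text above; the surrounding shape (vat names, values) is illustrative, not a real config.

```javascript
// Illustrative kernel config fragment using the options named above.
// The vat names and numeric values here are made up for the example.
const config = {
  defaultReapInterval: 200, // deliveries between BOYDs (kernel-wide default)
  defaultReapGCKrefs: 20, // GC krefs between BOYDs (kernel-wide default)
  vats: {
    comms: {
      // 'never' inhibits BOYD entirely, as comms requires
      creationOptions: { reapInterval: 'never' },
    },
    zoe: {
      // per-vat overrides of the kernel-wide defaults
      creationOptions: { reapInterval: 1000, reapGCKrefs: 40 },
    },
  },
};
console.log(config.vats.comms.creationOptions.reapInterval); // never
```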
We now reset the dirt counter upon any BOYD, so this also happens to help with #8665 (doing a `reapAllVats()` resets the delivery counters, so future BOYDs will be delayed, which is what we want). But we should still change `controller.reapAllVats()` to avoid BOYDs on vats which haven't received any deliveries.

closes #8980
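A toy simulation of the dirt-based scheduling described above, assuming (as the text says) that ten krefs "dropped and retired" contribute twenty units to `dirt.gcKrefs`; the names and structure here are illustrative, not the kernel's actual code.

```javascript
// Toy simulation of the dirt-based BOYD trigger. Thresholds mirror the
// defaults in the text: 200 deliveries, 20 gcKrefs.
const threshold = { deliveries: 200, gcKrefs: 20 };
let dirt = { deliveries: 0, gcKrefs: 0 };
let boyds = 0;

function maybeBOYD() {
  if (
    dirt.deliveries >= threshold.deliveries ||
    dirt.gcKrefs >= threshold.gcKrefs
  ) {
    boyds += 1;
    dirt = { deliveries: 0, gcKrefs: 0 }; // BOYD resets the dirt counters
  }
}

// ten krefs dropped and retired: each contributes one dropImports kref
// and one retireImports kref, i.e. 2 units of gcKrefs dirt
for (let i = 0; i < 10; i += 1) {
  dirt.gcKrefs += 2;
  maybeBOYD();
}
console.log(boyds); // 1: the 20-gcKref threshold fired once
```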
I am somewhat concerned that the cost of engine gc might mean 20 is too low. How do we validate we won't spend too much time in BOYD? Do we have an idea of what portion of the time spent in BOYD is related to the engine doing a gc mark and sweep, and how much of it is liveslots reacting to the found garbage? I assume liveslots usually has vatStore calls to make when garbage was found?
This does not sound scalable for adding new dirt thresholds. Why not add a …
I'll need to look at this. It sounds like a sort of schema migration.
Edit: I totally misunderstood.
This is mostly looking good.

I think we're missing a `clearReapDirt` for the BOYD injected when creating a snapshot.

Wondering if the schema migration should be handled more explicitly when starting the kernel instead of optimistically running it before each crank.

I'll need to do another pass and review the tests.
```js
assert(options.workerOptions, `vat ${vatID} options missing workerOptions`);
assert(source);
assert('bundle' in source || 'bundleName' in source || 'bundleID' in source);
assert.typeof(options, 'object');
```
This should check that `options` is a non-null object before checking that `options.workerOptions` exists, but this seems to be a problem with the original `setSourceAndOptions` too.

Which brings me to the second issue: I'm not a major fan of duplicating the logic between there and here. Any way to refactor to use the same underlying implementation for both the checks and saving the values to the kvStore?
aye, I'll change the order of the asserts

I guess I could factor out a two-line function (plus some asserts), and call it from both places, but it didn't seem worth it at the time, and I didn't think of a good name for it. `doSetSourceAndOptions(kvStore, source, options)`?

It's important that `initializeVatState()` remain a standalone function, not a method on `vatKeeper`, because it must be called before creating a `vatKeeper()` (which asserts the presence of `.options`, to populate the `reapDirtThreshold` RAM cache).

I'll see what I can do.
```js
/**
 * @typedef { import('../types-external.js').SwingSetKernelConfig } SwingSetKernelConfig
 * @typedef { import('../types-external.js').SwingStoreKernelStorage } SwingStoreKernelStorage
 * @typedef { import('../types-internal.js').InternalKernelOptions } InternalKernelOptions
 * @typedef { import('../types-internal.js').ReapDirtThreshold } ReapDirtThreshold
 */
```
We switched to TS 5.5 so we can now use `@import`! See #9149

That said, there are a lot of other places in kernel code where we should switch, so I'm ok punting for now.
yeah I'd like to do that cleanup in a separate PR, and I'm concerned about how it will interact with our current branch policy (cherry-picking changes to a release branch)
good call
```diff
@@ -379,7 +380,7 @@ export default function buildKernel(
  * didDelivery?: VatID, // we made a delivery to a vat, for run policy and save-snapshot
  * computrons?: BigInt, // computron count for run policy
  * meterID?: string, // deduct those computrons from a meter
- * decrementReapCount?: { vatID: VatID }, // the reap counter should decrement
+ * measureDirt?: [ VatID, Dirt ], // the dirt counter should increment
```
Any reason to switch to a tuple instead of a record here for the return value?
No good ones, I think I ran into a name collision during development, but I can't find evidence of it now, so I'll switch to a record.
```diff
@@ -601,6 +607,8 @@ export default function buildKernel(
   if (!vatWarehouse.lookup(vatID)) {
     return NO_DELIVERY_CRANK_RESULTS; // can't collect from the dead
   }
+  const vatKeeper = kernelKeeper.provideVatKeeper(vatID);
+  vatKeeper.addDirt({ gcKrefs: krefs.length });
```
Why not plumb this into `deliveryCrankResults` and rely on the handling of crank results to add dirt?

Let's assume we started counting the number of refs introduced in every delivery as a dirt counter (the more you have, the more likely you are to have some that are now garbage). That number is also derived from the message. In that case, does it make sense to increment here or in the results? I suppose the handling of abort/terminate may have an influence on this?
I think that would work. It would look like:

```js
function deliveryCrankResults(vatID, status, measureDirt, meterID, gcKrefs) {
  ...
  if (measureDirt && ..) {
    if (gcKrefs) {
      dirt.gcKrefs = gcKrefs;
    }
    ..
  }
  ...
}

deliveryCrankResults(vatID, status, true, undefined, krefs.length);
```

IIRC I balked at adding another positional argument to `deliveryCrankResults()` which would only be used by `processGCMessage()` but would need to be provided by all callers, especially since we've already got the optional `meterID` posarg, and stopped short of converting the posargs to an options bag.

I think the abort/terminate shouldn't be a big influence. I don't know exactly whether we should increment the dirt counter if the delivery caused a vat-fatal error or not, but I think either way would be ok.

I'll look to see if there's a decent way to refactor it as you suggest.
Oh yeah, we definitely should avoid more positional args. That's what named args (destructured records) are for.

Btw, the above was a light suggestion; I'm really not sure if it makes sense. I just had some trouble following the flow of where dirt gets changed.
```js
// the caller is responsible for ensuring that any changes made here
// get committed at a useful point
```
Some worry that a rollback could revert the kernel state but would leave the `upgraded` flag in place, but it looks like this is always called outside savepoints. Why not run this at the beginning of every `startCrank()`?
Yeah, there's one `upgraded` flag per `KernelKeeper` object, and each kernel has exactly one `KernelKeeper` (created at the start of `buildKernel()`), but we also create a `KernelKeeper` when we initialize/populate the DB, and we should not be doing an upgrade at that time, which is why `maybeUpgradeKernelState()` is a method and not an automatic consequence of `makeKernelKeeper()`.

It's currently called just before each call to `startCrank()`. I wanted to make it more explicit, because (as you noted) it's important to call it from outside a savepoint. I was worried that adding it to `startCrank()` would obscure the upgrade point and make it fragile under maintenance.

But.. in looking more carefully at how/when `startCrank` is called (it also shows up in `changeKernelOptions` and `installBundle`, to which I did not add a `maybeUpgrade` call), I think my PR would have a small bug. `kernelKeeper.startCrank()` calls `kernelStorage.startCrank()` in swing-store, which sets an `inCrank = true` flag and calls `resetCrankHash()`. By not calling `startCrank` before `maybeUpgrade` (or not calling it at all, if we start with `changeKernelOptions` or `installBundle`, which sounds pretty plausible TBH), we don't include any upgrade-related kvStore changes in the crankhash. That's not the intent: upgrades are in-consensus, all nodes of a consensus kernel should upgrade at the same time, and their changes should be included in the crankhash.

So I think I need to change this as you suggest: call `maybeUpgradeKernelState()` from inside `kernelKeeper.startCrank()`, but do it specifically after that function calls `kernelStorage.startCrank()`. I'm not excited to conceal the fact that upgrades are happening from call sites like `step()` and `run()`, but it's more important to have these upgrade changes be captured in the hash reliably.
> we also create a `KernelKeeper` when we initialize/populate the DB, and we should not be doing an upgrade at that time, which is why `maybeUpgradeKernelState()` is a method and not an automatic consequence of `makeKernelKeeper()`.

That had totally slipped my mind. Can we document these constraints somewhere?
```diff
@@ -68,7 +74,8 @@ const enableKernelGC = true;
 // bundle.$BUNDLEID = JSON(bundle)
 //
 // kernel.defaultManagerType = managerType
-// kernel.defaultReapInterval = $NN
+// (old) kernel.defaultReapInterval = $NN
```
Should we have a "formal" schema v0 designation? At some point I assume we'll have multiple "old"s.
Yeah, we're still in that fuzzy ad-hoc stage of this table's maturity curve, but I'll improve the comments and write down the definitions of v0/v1/whatever even though we don't have a single key that says what version the kvStore is at. For swingstore I'm working on making it precise (and doing a full+atomic migration from vN to vN+1), but for this table, the weird thing was that vatKeepers have an "upgrade your keys" function, which the kernel calls at startup (sometimes), so there's a brief window during this call where vat4 has migrated but vat5 has not, and that didn't lend itself to declaring the entire kvStore as at "version 2", and I didn't want to introduce per-vat version keys either.
```js
  return reapDirt;
}

function clearReapDirt() {
```
We need to also call this when injecting the BOYD message in `maybeSaveSnapshot`.
Agreed. I'll see if there's a way to factor out BOYD delivery to have just one place to call it from (we actually have three code paths: `processBringOutYourDead`, `processUpgradeVat` as the last thing sent to the old incarnation, and `maybeSaveSnapshot`). Ideally all three would call `processBringOutYourDead`, but that wasn't trivial. I'll see what I can do; worst case I'll just add another `clearReapDirt()` call like you suggested.
```js
// this default "reap interval" is low for the benefit of tests:
// applications should set it to something higher (perhaps 200) based
// on their expected usage

export const DEFAULT_DELIVERIES_PER_BOYD = 1;
```
I am somewhat concerned with a default value that is not "production ready", but this seems to be the existing behavior already, so out of scope. Is the builder for a test kernel the same as the production one?
The (weak) philosophy here was that we don't know what a good value would be, so make the default a predictable and hyper-safe one, at the cost of performance. Making it convenient for tests is another consideration, although of course it's always a lazy and inadvisable choice to prioritize tests over reality.

We might even make a hand-wavy argument that this matches XS: minimizing RAM usage (i.e. GCing frequently) at the cost of performance.

> Is the builder for a test kernel the same as the production one?

Not sure what you mean; our most important host application (cosmic-swingset) overrides this value, and in general we expect all host applications to do so.
I was remembering that all tests used a single function to build the kernel, in which case we could have a value more appropriate for production as a default here. Anyway, as stated, this is not really in scope for this PR.
```js
// We keep the vat's reap-dirt state (and threshold) in a
// write-through cache, and populate the cache at vatKeeper startup.
const reapDirtKey = `${vatID}.reapDirt`;
let reapDirt = JSON.parse(getRequired(reapDirtKey));
let reapDirtThreshold = getOptions().reapDirtThreshold;
```
I'm having funny feelings about ad-hoc caching, in particular because I don't fully understand the impact of savepoints and crank rollbacks.
I probably haven't considered it thoroughly enough, but my basic idea was that crank rollbacks still happen in consensus, and still take time, so some of our reactions to their costs should still happen even if the crank gets rolled back. Like, the runPolicy should be told about computrons spent inside rolled-back cranks (we don't get more time for a block just because the vat did something weird and needs to forget the delivery).

Obviously that doesn't apply to the amount of dirt inside a vat: abandoning the external consequences of the delivery will also abandon the vat-internal changes, like an increase in the amount of dirt. So I agree it would be appropriate to not increment `reapDirt` in the case of an abandoned crank.

I wanted to avoid too much extra DB activity for every crank. We definitely need to write out the new dirt value, so it's persistent (and this triggers an export-data IAVL change, sigh). We almost certainly shouldn't read the dirt threshold each time, since it rarely changes (and only by `setSourceAndOptions`, which happens in its own non-delivery crank, which isn't subject to rollback since there's no vat).

I currently call `addDirt()` from two places (the generic one in `processDeliveryMessage`, and the `gcKrefs`-only one in `processGCMessage`), so if I remove the `reapDirt` cache and do a full DB read+increment+write cycle each time, I'm paying 2 or 4 DB queries per crank. If I can implement your other suggestion to pass `gcKrefs` into `deliveryCrankResults()`, then that is reduced to 2 per crank.

So I think I'll:

- merge gcKrefs into `deliveryCrankResults()`
- remove the RAM cache of `reapDirt`, doing 2 DB actions instead of 1 each delivery
- leave the `reapDirtThreshold` RAM cache
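A minimal sketch of the uncached approach being discussed: each `addDirt()` call does one DB read and one DB write, with only the threshold kept in RAM, and dirt is not tracked for categories whose threshold is `'never'`. The Map-based kvStore and all names here are illustrative assumptions, not the swingset implementation.

```javascript
// Hypothetical uncached dirt tracker: one kvStore read plus one write
// per addDirt() call. A Map stands in for the swingset kvStore.
function makeVatDirtTracker(kvStore, vatID, reapDirtThreshold) {
  const key = `${vatID}.reapDirt`;
  if (!kvStore.has(key)) {
    kvStore.set(key, JSON.stringify({}));
  }
  function addDirt(moreDirt) {
    const reapDirt = JSON.parse(kvStore.get(key)); // DB read
    let reached = false;
    for (const [kind, amount] of Object.entries(moreDirt)) {
      const limit = reapDirtThreshold[kind];
      // skip untracked categories and 'never' thresholds entirely,
      // so the counters don't grow forever (the comms case)
      if (limit !== undefined && limit !== 'never') {
        reapDirt[kind] = (reapDirt[kind] || 0) + Number(amount);
        if (reapDirt[kind] >= limit) {
          reached = true; // caller should schedule a BOYD
        }
      }
    }
    kvStore.set(key, JSON.stringify(reapDirt)); // DB write
    return reached;
  }
  return addDirt;
}

const kvs = new Map();
const addDirt = makeVatDirtTracker(kvs, 'v1', { deliveries: 200, gcKrefs: 20 });
addDirt({ deliveries: 1 });
console.log(addDirt({ gcKrefs: 20 })); // true: gcKrefs threshold reached
```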
```js
// The 'moreDirt' value might be Number or BigInt (eg
// .computrons). We coerce to Number so we can JSON-stringify.
```
I thought we converted the `computrons` to Number before putting the value in the dirt record?
There's a duplicate coercion, yeah; I figured that was harmless.

This one was annoying. I first tried to make `gcKrefs` be a BigInt, to match how we handle computrons, but then I can't use simple JSON to store `reapDirt` or `reapDirtThreshold`, and holding it in RAM as a BigInt but converting back/forth to Number in the save/load functions was messy and distracting. Also, my choice to use "key is missing" as "no dirt of that category" made things slightly harder (the `|| 0` is to ignore `gcKrefs: undefined`-like cases).

And I didn't want to go back and change `computrons` to be Number everywhere; that was too much churn. I decided that I'd treat computrons as potentially large, and gcKrefs as always pretty small, and use that to justify leaving computrons as BigInts in the existing code. But then I ignore that argument in `addDirt` and do not tolerate large computron sizes. Sigh.
Oh yeah, `Number` is totally fine here for dirt computrons. Even if we go over the MAX_SAFE_INTEGER value, the loss of precision is deterministic and doesn't impact the threshold logic.
We certainly need to experiment on real data and tune this.

My notes/predictions are in #8928 (comment). This might actually be too aggressive, but if so we can reduce the cleanup rate instead of reducing the …

My worry about increasing …

Also, your question means I should measure the rate of non-cleanup-related …
I started out doing it that way, but it triggered a lot of test churn, and made it hard to distinguish between "I'm setting a threshold and I don't care about property X, use your default" vs "I'm setting a threshold and I want property X to never trigger BOYD", mostly because I stuck with a type of …

I'll take another look and see if there's a way to get a cleaner API, but I have to admit it's not my highest priority.
It is, for sure. I thought briefly about adding a …

```js
import { makeSwingsetController, maybeUpgradeKernelState } from '@agoric/swingset-vat';

const { kernelStorage, hostStorage } = openSwingStore(..);
maybeUpgradeKernelState(kernelStorage);
hostStorage.commit();
const controller = makeSwingsetController(kernelStorage, ..);
```

We've had similar discussions in #8089, and I think you convinced me to make it explicit, where the host app must call a function (at the right time), and not have the library perform an automatic upgrade, but then export-data considerations led us both to give up on using an explicit call, and instead having upgrade be implicit during the …

I think we've got similar issues here: the reapDirt upgrades will cause kvStore changes, and export-data/IAVL changes, that need to be committed. The situation is better, though, because we already have the …

I'll try to think about how we could make this an explicit call.
Right, we'd have …
I went to move the upgrade code into …

I thought about our plans for swing-store schema migration (concluding with this comment), and the constraints therein, and how they apply to this upgrade step. We don't have the same "where do the export-data changes show up?" problem, since …

Each kernel reboot uses some particular version of the …

In a consensus system, each "block" (or whatever they call the coordinated units of work) is executed with some particular version of the kernel/swingstore. The host app's responsibility is to ensure that each block is executed with the same version, across all workers. You must upgrade your kernel at the same time as everybody else, even though you're allowed to make non-upgrading reboots more frequently than that.

So, I think I see three options:

I prefer being very explicit about when DB changes might happen, so I don't like C. And it feels odd to have …

So I'm going to pick A, where we have a separate …

I'm also going to move to the explicit kvStore key that says what version the schema is, and graduate from our "ad-hoc" phase. I need something for …

And then I'll need to look at how cosmic-swingset calls this and make sure I'm not creating any problems. We'll already need to handle mutation (and thus export-data callbacks being invoked) with the #8089 changes, because the constraints there obligate us to perform upgrades as part of …

Note that this means the upgrade changes won't be included in any crank, and they won't be included in a crankhash. (To be precise, they'll get hashed into the …)
That sounds right. cosmic-swingset only builds the kernel as part of an init message, which is always done while processing a block at the cosmos level, so calls to the bridge are safe during that time.
This part is a little unfortunate, but I guess acceptable. If we ever change how we export the swing-store kv data, we'll likely still have data to validate it in IAVL. The unfortunate part is that right now we would have no way to detect this difference using the slog only, and exporting and comparing swing-store and IAVL trees is difficult compared to comparing slog entries (e.g. from a flight recorder).

Can we maybe add a new slog entry for upgrades, which captures the crankHasher state at the end, even if it doesn't end up chained like a normal crank?