You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
While applying a batch to the memtable, we iterate over the batch and one-by-one we allocate memory for each individual KV pair's node in the memtable. Each individual KV performs an atomic load and an atomic increment. Allocating a contiguous swath of memory from the arena all at once may reduce the overhead of batch application. If KVs within a batch are more likely to be proximate to each other than to KVs being committed by other batches (almost certainly), this would also improve read-time cpu cache locality by avoiding interleaving KVs from separate concurrently-applied batches.
The amount of memory required for a node is a function of the size of the key, the size of the value, and the height of the skiplist node. The height of the skiplist node is random and not a function of the existing skiplist at all. It could be decided ahead of time before even entering the commit pipeline, but then we'd need a place to remember it. We could conceivably steal 5 bits from the trailer somewhere (20 is the max node height; ⌈log2(20)⌉ = 5) but that sounds delicate. Or, we could avoid explicitly deciding each individual KV's node's height pre-application, and instead accumulate an aggregate height of all nodes in the batch. Then, at batch application time, we'd need to randomly distribute the aggregate height to individual KVs. (This could use some thought.)
The text was updated successfully, but these errors were encountered:
While applying a batch to the memtable, we iterate over the batch and one-by-one we allocate memory for each individual KV pair's node in the memtable. Each individual KV performs an atomic load and an atomic increment. Allocating a contiguous swath of memory from the arena all at once may reduce the overhead of batch application. If KVs within a batch are more likely to be proximate to each other than to KVs being committed by other batches (almost certainly), this would also improve read-time cpu cache locality by avoiding interleaving KVs from separate concurrently-applied batches.
The amount of memory required for a node is a function of the size of the key, the size of the value, and the height of the skiplist node. The height of the skiplist node is random and not a function of the existing skiplist at all. It could be decided ahead of time before even entering the commit pipeline, but then we'd need a place to remember it. We could conceivably steal 5 bits from the trailer somewhere (20 is the max node height; ⌈log2(20)⌉ = 5) but that sounds delicate. Or, we could avoid explicitly deciding each individual KV's node's height pre-application, and instead accumulate an aggregate height of all nodes in the batch. Then, at batch application time, we'd need to randomly distribute the aggregate height to individual KVs. (This could use some thought.)
The text was updated successfully, but these errors were encountered: