Add support for keeping pooling allocator pages resident #5207

Merged

Conversation

alexcrichton
Member

When new wasm instances are created repeatedly in high-concurrency environments, one of the largest bottlenecks is contention on the kernel-level locks protecting virtual memory. Usage in this environment is expected to leverage the pooling instance allocator with the `memory-init-cow` feature enabled, which means that the kernel's VM lock is acquired in operations such as:

  1. Growing a heap with `mprotect` (write lock)
  2. Faulting in memory during usage (read lock)
  3. Resetting a heap's contents with `madvise` (read lock)
  4. Shrinking a heap with `mprotect` when reusing a slot (write lock)

Rapid use of these operations can seriously degrade performance, especially on otherwise heavily loaded systems, and the effect worsens the more frequently the operations occur. This commit is aimed at case (2) above, reducing the number of page faults that must be fulfilled by the kernel.

Currently these page faults happen for three reasons:

  • When memory is first accessed after the heap is grown.
  • When the initial linear memory image is accessed for the first time.
  • When the initial zero'd heap contents, not part of the linear memory image, are accessed.

This PR addresses the last of these cases, and to a lesser extent the first case as well. Specifically, it provides the ability to partially reset a pooled linear memory with `memset` rather than `madvise`. Both have the same effect of resetting contents to zero, but they differ in their effect on paging: `memset` keeps the pages resident in memory rather than returning them to the kernel. This means that reuse of a linear memory slot on a page that was previously `memset` will not trigger a page fault, since everything remains paged into the process.
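
To make the mechanism concrete, here is a minimal sketch of what such a partial reset could look like. This is not the code from this PR; the function name, its parameters, and the use of the `libc` crate on a Linux target are assumptions purely for illustration:

```rust
use std::io::Error;

/// Conceptual sketch only -- `reset_slot` and its parameters are invented for
/// illustration and are not Wasmtime's actual implementation.
///
/// Resets a pooled linear-memory slot by zeroing the first `keep_resident`
/// bytes with a plain write (keeping those pages resident) and returning the
/// remainder of the slot to the kernel with `madvise(MADV_DONTNEED)`.
unsafe fn reset_slot(base: *mut u8, len: usize, keep_resident: usize) -> Result<(), Error> {
    let resident = keep_resident.min(len);
    unsafe {
        // "memset": these pages stay mapped and resident, so the next
        // instance to use this slot won't page fault when touching them.
        std::ptr::write_bytes(base, 0, resident);

        // "madvise": the rest of the slot is handed back to the kernel and
        // will be faulted back in lazily (as zeroed or CoW image pages) on
        // the next access.
        if len > resident {
            let rest = base.add(resident);
            if libc::madvise(rest.cast(), len - resident, libc::MADV_DONTNEED) != 0 {
                return Err(Error::last_os_error());
            }
        }
    }
    Ok(())
}
```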

The end result is that any access to linear memory which has been touched by `memset` will no longer page fault on reuse. On more recent kernels (6.0+) this also means that pages which were zero'd by `memset`, made inaccessible with `PROT_NONE`, and then made accessible again with `PROT_READ | PROT_WRITE` will not page fault. This can be common when a wasm instance grows its heap slightly and uses that memory, but the heap is then shrunk when the memory is reused for the next instance. Note again that this particular optimization requires a 6.0+ kernel.

The same optimization is additionally applied to async stacks in the pooling allocator as well as to table elements. The defaults of Wasmtime are not changing with this PR; instead, knobs are being exposed for embedders to turn if they so desire. This is currently being experimented with at Fastly, and I may come back and alter the defaults of Wasmtime if it seems suitable after our measurements.
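
The PR text above doesn't spell out the exact names of the new knobs, so the following is only a hypothetical sketch of how an embedder might turn them. The `PoolingAllocationConfig` methods shown (`linear_memory_keep_resident`, `table_keep_resident`) and the use of the `wasmtime` and `anyhow` crates are assumptions for illustration, not a confirmed description of this PR's API; consult the Wasmtime documentation for the real configuration surface:

```rust
use wasmtime::{Config, Engine, InstanceAllocationStrategy, PoolingAllocationConfig};

fn main() -> anyhow::Result<()> {
    // Hypothetical knob names: keep up to 1 MiB of each linear-memory slot
    // and 4 KiB of each table resident across instantiations rather than
    // handing those pages back to the kernel with `madvise`.
    let mut pooling = PoolingAllocationConfig::default();
    pooling.linear_memory_keep_resident(1 << 20);
    pooling.table_keep_resident(1 << 12);

    let mut config = Config::new();
    config.allocation_strategy(InstanceAllocationStrategy::Pooling(pooling));

    let engine = Engine::new(&config)?;
    // ... instantiate modules with `engine` as usual ...
    let _ = engine;
    Ok(())
}
```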

alexcrichton enabled auto-merge (squash) November 4, 2022 20:20
github-actions bot added the fuzzing, wasmtime:api, and wasmtime:config labels Nov 4, 2022
github-actions bot commented Nov 4, 2022

Subscribe to Label Action

cc @fitzgen, @peterhuene

This issue or pull request has been labeled: "fuzzing", "wasmtime:api", "wasmtime:config"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing
  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.


@github-actions
Copy link

github-actions bot commented Nov 4, 2022

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this checklist:

  • If you added a new Config method, you wrote extensive documentation for
    it.

    Our documentation should be of the following form (a hypothetical example follows this checklist):

    Short, simple summary sentence.
    
    More details. These details can be multiple paragraphs. There should be
    information about not just the method, but its parameters and results as
    well.
    
    Is this method fallible? If so, when can it return an error?
    
    Can this method panic? If so, when does it panic?
    
    # Example
    
    Optional example here.
    
  • If you added a new Config method, or modified an existing one, you
    ensured that this configuration is exercised by the fuzz targets.

    For example, if you expose a new strategy for allocating the next instance
    slot inside the pooling allocator, you should ensure that at least one of our
    fuzz targets exercises that new strategy.

    Often, all that is required of you is to ensure that there is a knob for this
    configuration option in wasmtime_fuzzing::Config (or one
    of its nested structs).

    Rarely, this may require authoring a new fuzz target to specifically test this
    configuration. See our docs on fuzzing for more details.

  • If you are enabling a configuration option by default, make sure that it
    has been fuzzed for at least two weeks before turning it on by default.
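
As a concrete illustration of the documentation form described in the first checklist item, here is a hedged sketch; the `Config` stand-in type and its `keep_resident` method are invented purely to show the shape, not actual Wasmtime API:

````rust
/// Stand-in type purely for illustrating the documentation form; this is not
/// Wasmtime's actual `Config`.
#[derive(Default)]
pub struct Config {
    keep_resident: usize,
}

impl Config {
    /// Configures how many bytes of each pooled allocation are kept resident
    /// between uses.
    ///
    /// This setting controls how much of an allocation is reset with a plain
    /// write (keeping its pages resident) rather than being returned to the
    /// kernel. Larger values trade higher steady-state memory usage for fewer
    /// page faults when a pooled slot is reused.
    ///
    /// This method is infallible and never panics.
    ///
    /// # Example
    ///
    /// ```ignore
    /// let mut config = Config::default();
    /// config.keep_resident(1 << 20);
    /// ```
    pub fn keep_resident(&mut self, bytes: usize) -> &mut Self {
        self.keep_resident = bytes;
        self
    }
}
````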


To modify this label's message, edit the .github/label-messager/wasmtime-config.md file.

To add new label messages or remove existing label messages, edit the
.github/label-messager.json configuration file.


alexcrichton merged commit d3a6181 into bytecodealliance:main Nov 4, 2022
alexcrichton deleted the keep-resident-options branch November 4, 2022 20:56
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Nov 4, 2022
alexcrichton added a commit that referenced this pull request Nov 8, 2022
* Implement support for dynamic memories in the pooling allocator

This is a continuation of the thrust in #5207 for reducing page faults
and lock contention when using the pooling allocator. To that end this
commit implements support for efficient memory management in the pooling
allocator when using wasm that is instrumented with bounds checks.

The `MemoryImageSlot` type now avoids unconditionally shrinking memory
back to its initial size during the `clear_and_remain_ready` operation,
instead deferring optional resizing of memory to the subsequent call to
`instantiate` when the slot is reused. The instantiation portion then
takes the "memory style" as an argument, which dictates whether the
accessible memory must fit precisely or whether it's allowed to exceed
the maximum. This in effect enables skipping a call to `mprotect`
to shrink the heap when dynamic memory checks are enabled.
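
As a rough illustration of that deferral, here is a conceptual sketch; the `Slot` and `Style` types and their methods are hypothetical stand-ins for illustration, not the actual `MemoryImageSlot` implementation:

```rust
struct Slot {
    /// Bytes of the slot currently mapped read/write.
    accessible: usize,
}

enum Style {
    /// Accessible memory must exactly match the requested size (static
    /// memories that rely on guard pages instead of explicit bounds checks).
    Precise,
    /// Accessible memory may exceed the requested size because every wasm
    /// load/store is explicitly bounds-checked.
    MayExceed,
}

impl Slot {
    fn clear_and_remain_ready(&mut self) {
        // Zero or madvise the contents here, but do *not* mprotect-shrink the
        // slot; any resizing is deferred to the next `instantiate` call.
    }

    fn instantiate(&mut self, requested: usize, style: Style) {
        match style {
            // A precise fit requires an mprotect call whenever the previous
            // instance left the accessible region at a different size.
            Style::Precise if self.accessible != requested => {
                self.set_protection(requested)
            }
            // Bounds-checked memories tolerate extra accessible pages, so the
            // shrinking mprotect (and its kernel write lock) is skipped; only
            // growth still needs an mprotect call.
            _ => {
                if self.accessible < requested {
                    self.set_protection(requested)
                }
            }
        }
    }

    fn set_protection(&mut self, accessible: usize) {
        // Placeholder for the mprotect call adjusting the read/write region.
        self.accessible = accessible;
    }
}
```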

In terms of page fault and contention this should improve the situation
by:

* Fewer calls to `mprotect` since once a heap grows it stays grown and
  it never shrinks. This means that a write lock is taken within the
  kernel much more rarely than before (only asymptotically now, not
  N-times-per-instance).

* Accessed memory after a heap growth operation will not fault if it was
  previously paged in by a prior instance and set to zero with `memset`.
  Unlike #5207 which requires a 6.0 kernel to see this optimization this
  commit enables the optimization for any kernel.

The major cost of choosing this strategy is naturally the performance
hit of the wasm itself. This is being looked at in PRs such as #5190 to
improve Wasmtime's story here.

This commit does not implement any new configuration options for
Wasmtime but instead reinterprets existing configuration options. The
pooling allocator no longer unconditionally sets
`static_memory_bound_is_maximum` and instead implements the support
necessary for this memory type. The other change in this commit is that
the `Tunables::static_memory_bound` configuration option no longer
gates the creation of a `MemoryPool`; the pool will now size itself to
`instance_limits.memory_pages` if `static_memory_bound` is too small.
This is done to accommodate fuzzing more easily, where
`static_memory_bound` will become small during fuzzing and the
configuration would otherwise be rejected and require manual handling.
The spirit
of the `MemoryPool` is one of large virtual address space reservations
anyway so it seemed reasonable to interpret the configuration this way.

* Skip zero memory_size cases

These cases cause errors when fuzzing and in theory shouldn't be too
interesting to optimize for anyway, since they likely aren't used in
practice.