Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io: rewrite slab to support compaction #2757

Merged
merged 16 commits into from Aug 12, 2020
Merged

io: rewrite slab to support compaction #2757

merged 16 commits into from Aug 12, 2020

Conversation

carllerche
Copy link
Member

@carllerche carllerche commented Aug 10, 2020

The I/O driver uses a slab to store per-resource state. Doing this
provides two benefits. First, allocating state is streamlined. Second,
resources may be safely indexed using a usize type. The usize is
used passed to the OS's selector when registering for receiving events.

The original slab implementation used a Vec backed by RwLock. This
primarily caused contention when reading state. This implementation also
only grew the slab capacity but never shrank. In #1625, the slab was
rewritten to use a lock-free strategy. The lock contention was removed
but this implementation was still grow-only.

This change adds the ability to release memory. Similar to the previous
implementation, it structures the slab to use a vector of pages. This
enables growing the slab without having to move any previous entries. It
also adds the ability to release pages. This is done by introducing a
lock when allocating/releasing slab entries. This does not impact
benchmarks, primarily due to the existing implementation not being
"done" and also having a lock around allocating and releasing.

A Slab::compact() function is added. Pages are iterated. When a page
is found with no slots in use, the page is freed. The compact()
function is called occasionally by the I/O driver.

@carllerche carllerche added C-enhancement Category: A PR with an enhancement or bugfix. A-tokio Area: The main tokio crate M-io Module: tokio/io labels Aug 10, 2020
@carllerche carllerche requested a review from hawkw August 10, 2020 23:04
The I/O driver uses a slab to store per-resource state. Doing this
provides two benefits. First, allocating state is streamlined. Second,
resources may be safetly indexed using a `usize` type. The `usize` is
used passed to the OS's selector when registering for receiving events.

The original slab implementation used a `Vec` backed by `RwLock`. This
primarily caused contention when reading state. This implementation also
only **grew** the slab capacity but never shrank. In #1625, the slab was
rewritten to use a lock-free strategy. The lock contention was removed
but this implementation was still grow-only.

This change adds the ability to release memory. Similar to the previous
implementation, it structures the slab to use a vector of pages. This
enables growing the slab without having to move any previous entries. It
also adds the ability to release pages. This is done by introducing a
lock when allocating / releasing slab entries. This does not impact
benchmarks, primarily due to the existing implementation not being
"done" and also having a lock around allocating and releasing.

A `Slab::compact()` function is added. Pages are iterated. When a page
is found with no slots in use, the page is freed. The `compact()`
function is called occassionally by the I/O driver.
@carllerche
Copy link
Member Author

I should add that I first attempted to modify the current slab implementation to support releasing pages. However, I did not see an obvious way to make this work. The primary issue is around the requirement to be able to index entries in the slab. Without restructuring the slab, there would be a race between indexing the slab and releasing pages.

The new design supports safely indexing and compacting by forcing both compaction and indexing to go via the same type that is not Clone. This allows leveraging rust lifetimes to avoid having to lock .

Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nice. After a quick skim, I didn't see anything concerning, and the overall approach seems nice. It would be good to have a little more commentary on a couple things, including the caching scheme, but overall, it's super clear and well documented.

I'd like to give it a closer look, but so far, it seems right to me!

Have we done any performance testing vs. the prior implementation? I'm curious (though not concerned) to see if there's a noticeable difference.

tokio/src/io/driver/scheduled_io.rs Outdated Show resolved Hide resolved
tokio/src/io/driver/mod.rs Show resolved Hide resolved
tokio/src/util/slab.rs Outdated Show resolved Hide resolved
tokio/src/util/slab.rs Show resolved Hide resolved
tokio/src/util/slab.rs Show resolved Hide resolved
tokio/src/util/slab.rs Outdated Show resolved Hide resolved
tokio/src/util/slab.rs Show resolved Hide resolved
tokio/src/util/slab.rs Outdated Show resolved Hide resolved
tokio/src/util/slab.rs Show resolved Hide resolved
@carllerche
Copy link
Member Author

@hawkw Re performance, I ran basic benches & hyper and nothing stood out. Is there something specific you are worried about? What change do you think could cause a perf regression?

@hawkw
Copy link
Member

hawkw commented Aug 11, 2020

@hawkw Re performance, I ran basic benches & hyper and nothing stood out. Is there something specific you are worried about? What change do you think could cause a perf regression?

Nothing I was worried about, just curious. The new implementation seems simpler enough that I was wondering if indexing might be a little faster, but I wasn't really expecting a big change.

@hawkw
Copy link
Member

hawkw commented Aug 11, 2020

CI failure looks like it's my fault --- an upstream tracing-attributes change seems to have broken MSRV.

Edit: huh, I'm surprised anything in our dependency tree is even pulling tracing-attributes...tokio disables the feature flag for attributes, and AFAICT we don't depend on anything else that uses tracing. What the heck!

Copy link
Contributor

@Darksonn Darksonn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only looked at slab.rs.

tokio/src/util/slab.rs Outdated Show resolved Hide resolved
tokio/src/util/slab.rs Outdated Show resolved Hide resolved
tokio/src/util/slab.rs Outdated Show resolved Hide resolved
tokio/src/util/slab.rs Outdated Show resolved Hide resolved
tokio/src/util/slab.rs Show resolved Hide resolved
@carllerche
Copy link
Member Author

@hawkw @Darksonn I believe I addressed all feedback.

Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no remaining blockers, this looks good to me. the implementation seems solid, and it's very well documented! good stuff!

@@ -296,24 +300,26 @@ impl Registration {
// AND write. A specific case that `EPOLLERR` occurs is when the read
// end of a pipe is closed. When this occurs, a peer blocked by
// writing to the pipe should be notified.
let curr_ready = sched
.set_readiness(self.address, |curr| curr & (!mask_no_hup))
.unwrap_or_else(|_| panic!("address {:?} no longer valid!", self.address));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it looks like the previous panic message was a bit more helpful..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to set_readiness now can't hit a code path that returns Err. The only panic message that makes sense is "Tokio hit a bug, plz report" :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can panic w/ unreachable!() instead...

@@ -53,6 +43,10 @@ impl Pack {
(base & !self.mask) | (value << self.shift)
}

pub(crate) fn pack_lossy(&self, value: usize, base: usize) -> usize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: may be worth a comment explaining what this does, particularly that bits may be lost on both the least- and most-significant ends of value?

Comment on lines +17 to +54
/// # Pages
///
/// Unlike `Vec`, growing does not require moving existing elements. Instead of
/// being a continuous chunk of memory for all elements, `Slab` is an array of
/// arrays. The top-level array is an array of pages. Each page is 2x bigger
/// than the previous one. When the slab grows, a new page is allocated.
///
/// Pages are lazily initialized.
///
/// # Allocating
///
/// When allocating an object, first previously used slots are reused. If no
/// previously used slot is available, a new slot is initialized in an existing
/// page. If all pages are full, then a new page is allocated.
///
/// When an allocated object is released, it is pushed into it's page's free
/// list. Allocating scans all pages for a free slot.
///
/// # Indexing
///
/// The slab is able to index values using an address. Even when the indexed
/// object has been released, it is still safe to index. This is a key ability
/// for using the slab with the I/O driver. Addresses are registered with the
/// OS's selector and I/O resources can be released without synchronizing with
/// the OS.
///
/// # Compaction
///
/// `Slab::compact` will release pages that have been allocated but are no
/// longer used. This is done by scanning the pages and finding pages with no
/// allocated objects. These pages are then freed.
///
/// # Synchronization
///
/// The `Slab` structure is able to provide (mostly) unsynchronized reads to
/// values stored in the slab. Insertions and removals are synchronized. Reading
/// objects via `Ref` is fully unsynchronized. Indexing objects uses amortized
/// synchronization.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole comment is lovely, thank you <3

@carllerche carllerche merged commit 8feebab into master Aug 12, 2020
@carllerche carllerche deleted the simplified-slab branch August 12, 2020 05:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate C-enhancement Category: A PR with an enhancement or bugfix. M-io Module: tokio/io
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants