release-20.1: kv: improve Raft scheduler behavior under CPU starvation #64568

Merged

Commits on May 3, 2021

  1. kv: cap COCKROACH_SCHEDULER_CONCURRENCY at 96

    Relates to cockroachdb#56851.
    
    In investigations like cockroachdb#56851, we've seen the mutex in the Raft
    scheduler collapse under too much concurrency. To address this, we
    needed to shrink the scheduler's goroutine pool, bounding contention on
    the mutex so that the scheduler was still able to schedule any
    goroutines at all.
    
    This commit caps this concurrency at 96, instead of letting it grow
    without bound as a function of the number of cores on the system.
    
    Release note (performance improvement): The Raft processing goroutine
    pool's size is now capped at 96. This was observed to prevent instability
    on large machines (32+ vCPU) in clusters with many ranges (50k+ per node).
    nvanbenschoten committed May 3, 2021
    e1c93b1
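
The cap described in the first commit can be sketched as a small helper. This is a minimal illustration rather than the code from the commit: the function and constant names are hypothetical, the 8-workers-per-CPU factor is taken from the second commit message, and the handling of an explicit COCKROACH_SCHEDULER_CONCURRENCY override is not modeled.

```go
package main

import (
	"fmt"
	"runtime"
)

const (
	// workersPerCPU mirrors the 8*num_cpus sizing mentioned in the commit
	// messages. The constant name is hypothetical.
	workersPerCPU = 8
	// maxSchedulerConcurrency is the cap of 96 introduced by the commit.
	// The constant name is hypothetical.
	maxSchedulerConcurrency = 96
)

// defaultSchedulerConcurrency returns the worker pool size for a machine
// with numCPU cores: it grows linearly with the core count until it
// reaches the cap, then stays flat.
func defaultSchedulerConcurrency(numCPU int) int {
	n := workersPerCPU * numCPU
	if n > maxSchedulerConcurrency {
		n = maxSchedulerConcurrency
	}
	return n
}

func main() {
	// Small machines are unaffected; the cap takes effect at 12 cores.
	for _, cpus := range []int{4, 12, 32, 96} {
		fmt.Printf("%d cores -> %d workers\n", cpus, defaultSchedulerConcurrency(cpus))
	}
	fmt.Println("this machine:", defaultSchedulerConcurrency(runtime.GOMAXPROCS(0)))
}
```

Under these assumptions a 32-core machine gets 96 workers instead of 256, which bounds the number of goroutines that can contend on the scheduler mutex at once.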
  2. kv: prioritize NodeLiveness Range in Raft scheduler

    Relates to cockroachdb#56851.
    
    In cockroachdb#56851 and in many other investigations, we've seen cases where the
    NodeLiveness Range has a hard time performing writes when a system is
    under heavy load. We already split RPC traffic into two classes,
    ensuring that NodeLiveness traffic does not get stuck behind traffic on
    user ranges. However, to this point, it was still possible for the
    NodeLiveness range to get stuck behind other Ranges in the Raft
    scheduler, leading to high scheduling latency for Raft operations.
    
    This commit addresses this by prioritizing the NodeLiveness range above
    all others in the Raft scheduler. This prioritization mechanism is
    naive, but should be effective. It should also not cause fairness
    problems or starve other ranges: such starvation is not possible as
    long as the scheduler concurrency (8*num_cpus) is above the number of
    high-priority ranges (1).
    
    Release note (performance improvement): The Raft scheduler now prioritizes
    the node liveness Range. This was observed to prevent instability on large
    machines (32+ vCPU) in clusters with many ranges (50k+ per node).
    nvanbenschoten committed May 3, 2021
    145b011