Workload Partitioning #380

jshook · 2021-11-10T18:57:12Z

jshook
Nov 10, 2021
Maintainer

Abstract

NoSQLBench workloads are designed to be partitionable. Using the whole workload conception, there is a cohesive and complete definition for a given workload, given the statements, bindings, and cycle range over which it runs.

This feature has been used by testers for advanced scenarios or for driving high concurrency. Yet, it is sub-surface and implied in many scenarios rather than made part of the explicit machinery of distributing tests. While it is still true that many workloads can be easily tested using a single client, server technology is advancing enough that load banking multiple clients is becoming more common. k8s and k8ssandra style deployments require updated deployment methods for co-located client tooling. Metrics and operational views need to be aggregated in a logically consistent way. It is time to streamline and simplify the concepts of workload partitioning to simplify all of these scenarios. Automatic workload partitioning is only a piece of the approach to address these needs, but it is the first and most fundamental.

Goals

Streamline and codify the mechanism by which workloads can be easily partitioned.
Create a set of partitioning methods which can take a command or set of commands and render them into piece-wise commands automatically.
Make these methods optional, but clear and useful enough that they are preferred.

Non-Goals

This is not an attempt to solve the need to distribute or partition workloads. This is already possible, as intended with the underlying building blocks.
This is not an attempt to remove options for current workload partitioning methods.

Proposal

Splitting By Cycles

As workloads are completely defined by the activity parameters, operations, bindings, and cycle range, workload partitioning can simply be the act of partitioning these parameters into enumerable parameter sets. For example:

nb run driver=foo workload=bar cycles=1000

can be partitioned into two different workloads as

nb run driver=foo workload=bar cycles=500
nb run driver=foo workload=bar cycles=500..1000

These be run concurrently on two different systems in order to offer twice the workload from the client perspective. Yet, in this mode, essential details which are definitive for the testing scenario are implied at best.

This could be triggered simply by providing the following parameter

... split_cycles=3/7 ...

In other words, specifying the above parameter would cause the cycles parameter to be recalculated in-place, with accompanying diagnostics to explain.

One serious caveat to this method is that sequences of dependent operations (stanzas) may be split on cycle boundaries. Normal dispatch to a thread includes micro-batching by stride (AKA sequence length). To support this method robustly without breaking apart stanzas, the boundary of the split must fall directly on a stride boundary. Defensive usage patterns would need to be implemented to protect against this and/or provide a finesse to work around it.

An alternate proposal is to explicitly split by stride boundary:

... split_strides=3/7 ...

In this case, it is clear that the workload will be split on a stride boundary. There is no need for the partitioning logic to do any more than ensure that the splits are moved to the next stride boundary.

Splitting By Precursor Function

In some trivial cases, the split by cycles method is sufficient, yet it does gloss over some controls testers sometimes use for advanced scenarios.

Bindings are effectively functions which address into a virtual space of possible values. They are chained together to sub-select from this space. The initial functions on a binding, called the precursor functions can map all of the operations in a workload into a different virtual space. Take these examples:

    scenarios:
      default: run driver=stdout
    bindings:
      value: Identity()
      evens: Mul(2L)
      odds: Mul(2L); Add(1L);
      div2: Div(2L);
      steps1: Expr('((cycle/2)*2)+cycle')
      steps2: Expr('((cycle/2)*2)+2+cycle')

This produces the output:

value=0 evens=0 odds=1 div2=0 steps1=0 steps2=2
value=1 evens=2 odds=3 div2=0 steps1=1 steps2=3
value=2 evens=4 odds=5 div2=1 steps1=4 steps2=6
value=3 evens=6 odds=7 div2=1 steps1=5 steps2=7
value=4 evens=8 odds=9 div2=2 steps1=8 steps2=10

... which provides a basic mathematical basis for dividing the cycles over sub-intervals on some stride. Notice that the steps1 and the steps2 sequences are the sum of of a step function, an offset, and the cycle. In other words, the cycle interval has been broken into chunks. This should be implemented in a Step(<stride>,<client_index>,<client_count>) function for easier use.

With a binding prefix function like Step(...), two questions arise:

How do we inject the node details into the function definition?
How do we inject the stride into the function definition?

For 1, a simple method of including env vars into a YAML can be supported. This is simply an extension of the existing work on support env vars, applied to the workload loader.

For 2, a decorator behavior exists for functions to ask for external configuration data. This simply needs to be extended to support the safe intersection of data coming from possibly more than one place, with ensemble data provided across the board. Only the Step(...) function need care about this data at initialization time.

The split-by-cycles and the split-by-bindings methods are not mutually exclusive. In fact, the split-by-bindings method effectively requires the help of the cycle splitting logic to function correctly.

As such, split_bindings=... implies split_cycles=..., and thus these two parameters constitute two different levels of workload pre-processing.

Why Both?

Data bindings allow us to ascribe content to operations in a test-worthy way. Mechanically, they use the cycle number as a seed value to provide a deterministic operation for any given cycle. This makes the interval of cycles an essential part of the test definition, although only a part. The bindings provide the other part which is needed to generate test content.

A binding which maps a cycle monotonically to an identifier in a dataset has a very different effect on access patterns than one which is pseudo-random across the address space of that dataset.

bindings:
  monotonic: Mul(5L)
  prng: Hash()

In the example above, monotonic is advancing an identifier by 5 for each cycle as a long value. Access patterns that propagate through the layers of the system in this case will be read-ahead friendly, and, as such, may be an explicit part of a test design. This holds strictly when there is a single client controlling the active region of access. If you split this workload into 10 parts and assign them to individual client nodes, even if you maintain the same concurrency and rate, performance dynamics will vary. As such, it is effectively a different test definition.

We can not expect to avoid this side-effect of partitioning. The primary goal in this proposal is to surface the details in a self-consistent way that can be taken for granted once the pattern is established. When and where partitioning occurs should be evident, as well as how to do it with the addition of a few standardized parameters.

This illustrates directly the reason for using binding function prefixes. With this method, concurrent clients may act more in-concert with respect to the active region of a test, and by default will be the preferred method. If we are forced to pick a method, this is the most universally consistent behavior. It is effectively like spreading concurrent work over a larger thread pool that spans multiple systems.

Usage Invariants

For every scenario, the client enumeration shall be provided to any bindings or scripting layers. This can come from parameters provided by a user, or it can come from environment variables, or it can come from the default of "1/1", in that order. In effect, every nosqlbench process is now presumed to be one of multiple -- "1/1" in the default case.

When more than one method of workload splitting is prescribed, a warning shall be given to the user.

When cycles are assigned from a logical workload definition (The command including parameters, cycles, workload, etc), there shall be exactly the same number of active cycles as before, summed over the individually projected versions of the workload. Corollary: It should be obvious from looking at a projected test definition how to transform it back into the previous definition.

Integration

To facilitate integration with other systems which may not provide workload splitting parameters directly via command line, environment variables should be supported.

For example, specifying ... split_bindings=${ENV_VAR} ... should already work when interpreted through a shell layer, which is often the case. However, some integrations do not have this layer and injecting an additional shell layer just for this purpose is onerous. To facilitate this, the environment variables should be supported in the following cases:

From within YAML files, at the same point at which template variables are expanded, but using standard shell syntax of $word or ${...}.
From within parameters on the command line, limited to the stricter syntax (${...}). This is effectively, but not mechanically the same as using a shell environment to do the variable interpolation. It will allow for direct invocation of jobs and processes without a complicating and potentially wasteful intermediary layer. Also, the more strict version is very unlikely to overlap in practice with any other actual intended behavior.

farrell0 · 2021-11-10T19:04:34Z

farrell0
Nov 10, 2021

On the topic of injection, and ENV Vars .. I've recently grown very fond of Node/js/REACT and Python .properties files.

If we had support for those .. ..

1 reply

jshook Feb 9, 2023
Maintainer Author

This seems like a reasonable thing to add. Added #1076

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Workload Partitioning #380

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 1 comment 1 reply

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Select a reply

Workload Partitioning #380

jshook Nov 10, 2021 Maintainer

Abstract

Goals

Non-Goals

Proposal

Splitting By Cycles

Splitting By Precursor Function

Why Both?

Usage Invariants

Integration

Replies: 1 comment · 1 reply

farrell0 Nov 10, 2021

jshook Feb 9, 2023 Maintainer Author

jshook
Nov 10, 2021
Maintainer

Replies: 1 comment 1 reply

farrell0
Nov 10, 2021

jshook Feb 9, 2023
Maintainer Author