[WIP] Operation pools POC #209

DXist · 2023-09-05T21:22:03Z

This PR is not finished and mostly done to discuss my ideas and get feedback.

I'm exploring monoio for my project and would like to split execution of a single thread application into synchronous and asynchronous parts (sync/async).

In other words I see it as a loop that interleaves between

the sync part which is a deterministic state machine that manipulates inmemory state and runs in discrete ticks
and async part that drives network/storage IO until either all IO work is completed or tick timeout is hit (~10ms)

I see the async part as a single future that always completes within timeout limit.

Besides I need some way to pass execution context between sync and async parts. I don't like approaches with returning BoxFuture's or spawning async tasks because I prefer to do allocations only during application startup.

One option that I started to implement in this PR is to have an API oriented on batched operations similar to IO uring:

synchronously enqueue IO operations with user provided index (user_data) - this index is different from IO Uring user_data and is not sent via the ring
If the ring is full, don't issue submit syscall but queue in the driver - to postpone context switch to asynchronous code
asynchronously wait using some driver-level API function (OpPools::submit?) for at least one of operations in a batch is completed
synchronously process IO results of completed operations
- use the returned user_data index, get context of the application-level operations and
- change the dependent inmemory state
synchronously flush any operations enqueued to the driver but not submitted to the ring

Another idea is to establish configurable in compile time resource limits:

max total number of outstanding IO Uring operations
max number of outstanding operations of each type - for example,
- limit for Accept op is regulated by anticipated maximum number of incoming connections or
- maximum number of inflight read/write operations

I included in the POC AcceptPool structure that

allocates and reuses socketaddr storage for N Accept Ops and
holds the data of Accept operations until user code will process completed results.

Result processing is expected to return the acquired allocated resources to the pool.

Each IO operation could have it's own pool and driver could have configured pools for all operations (see OpPools structure).

I would like to get feedback and check if the project is aligned with the illustrated ideas.

CLAassistant · 2023-09-05T21:22:08Z

All committers have signed the CLA.

ihciah · 2023-09-15T07:13:09Z

Sorry, I can't quite get the problem you want to solve. Now the model is like:

Run sync code instantly, or use spawn_blocking and await it if heavy(this will turn it into async code).
Wrap sync code to async code finally.

Under what circumstances would we want to call asynchronous code from synchronous code?

DXist · 2023-09-15T11:02:24Z

I decided not to use async approach because I don't want to do memory allocation on task::spawn().

I plan to run my appication with callback-based approach like this

// preallocate memory on startup
let mut app_state_machine = AppStateMachine::with_config(config);
let mut io_queue = VecDeque<OpKey>;

let mut finished = false;
while (!finished) {
     finished = app_state_machine.tick(&mut io_queue);
     // app_state_machine can dispatch completions
     io_driver.run_until_timeout(&mut app_state_machine, &mut io_queue, timeout)?; // fail only on fatal errors
}

So I need a lower level IO driver rather than IO runtime. I don't know if monoio wants to expose cross-platform driver but I found an alternative that provides public driver.

ihciah · 2023-09-19T09:06:41Z

Please give some time to investigate it...

nyabinary · 2024-01-03T22:42:46Z

Please give some time to investigate it...

Any updates?

DXist · 2024-01-04T06:26:15Z

I have an update from my side - I've customized completion-style IO driver backed by io_uring/IOCP/kqueue. I integrated it in my application that preallocates memory for IO buffers and IO operation arguments during startup and runs custom event loop.

Summary of customizations:

fixed size submission queue for operations
external runtime could submit an external queue of not yet queued operations as a single batch
timers are exposed as Timeout operation and use suspend-aware CLOCK_BOOTTIME clock source when it's available
file descriptors could be registered to remove refcounting overhead in io_uring
Separate implementations of nonvectored and vectored Recv/Send and RecvFrom/SendTo - to exclude msghdr related overhead
Use send/recv syscalls/opcodes over read/write to remove the associated overhead
Add Read/Write operations for nonseekable files

start POC implementation

efdd8a6

DXist marked this pull request as draft September 5, 2023 21:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Operation pools POC #209

[WIP] Operation pools POC #209

DXist commented Sep 5, 2023 •

edited

CLAassistant commented Sep 5, 2023 •

edited

ihciah commented Sep 15, 2023

DXist commented Sep 15, 2023 •

edited

ihciah commented Sep 19, 2023

nyabinary commented Jan 3, 2024

DXist commented Jan 4, 2024

[WIP] Operation pools POC #209

Are you sure you want to change the base?

[WIP] Operation pools POC #209

Conversation

DXist commented Sep 5, 2023 • edited

CLAassistant commented Sep 5, 2023 • edited

ihciah commented Sep 15, 2023

DXist commented Sep 15, 2023 • edited

ihciah commented Sep 19, 2023

nyabinary commented Jan 3, 2024

DXist commented Jan 4, 2024

DXist commented Sep 5, 2023 •

edited

CLAassistant commented Sep 5, 2023 •

edited

DXist commented Sep 15, 2023 •

edited