
Prost support (i.e. #[global_allocator] or arena allocator) #13

Closed
tarcieri opened this issue Jan 20, 2020 · 4 comments

@tarcieri
Contributor

tarcieri commented Jan 20, 2020

We'd like to use Protocol Buffers as the serialization format for our "APDUs".

Prost provides a pure Rust implementation of the format, and Prost v0.6 upgraded to bytes v0.5, which added #![no_std] + liballoc support. There's an open PR to add #![no_std] + liballoc to Prost as well:

https://github.com/danburkert/prost/pull/215

If that lands, the simplest option for supporting Prost would be to use liballoc and define a #[global_allocator]. However, bytes was also refactored to use customizable vtables as its backing storage (though it does not yet fully support / expose them):

tokio-rs/bytes#298

This should allow it to support e.g. arena allocators. Since we're trying to support a simple APDU-like RPC protocol (possibly with interrupts and fixed-N concurrent requests), we could probably get by with a very simple arena allocator like [bumpalo](https://github.com/fitzgen/bumpalo) (although it looks like bumpalo needs liballoc too), since we'd always reclaim all of the allocated memory when a request completes. That seems like it might be a better fit for an RTFM environment, especially if we did eventually want to hit certain realtime deadlines.
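The reclaim-everything-when-the-request-completes idea can be sketched as a minimal bump arena. This is illustrative only (bumpalo's real API hands out typed references; the buffer here stands in for a static buffer in a real no_std build):

```rust
// Minimal bump-arena sketch: allocations advance a cursor through a fixed
// buffer, and finishing a request resets the cursor, reclaiming everything
// at once in O(1).
struct BumpArena {
    buf: Vec<u8>, // stand-in for a static buffer on bare metal
    cursor: usize,
}

impl BumpArena {
    fn new(capacity: usize) -> Self {
        BumpArena { buf: vec![0; capacity], cursor: 0 }
    }

    // Hand out `len` bytes, or None on exhaustion (recoverable OOM).
    fn alloc(&mut self, len: usize) -> Option<&mut [u8]> {
        if self.cursor + len > self.buf.len() {
            return None;
        }
        let start = self.cursor;
        self.cursor += len;
        Some(&mut self.buf[start..self.cursor])
    }

    // Called when a request completes: all allocations are reclaimed at once.
    fn reset(&mut self) {
        self.cursor = 0;
    }
}

fn main() {
    let mut arena = BumpArena::new(64);
    assert!(arena.alloc(40).is_some());
    assert!(arena.alloc(40).is_none()); // OOM is a local, recoverable Option
    arena.reset();                      // request done: memory reclaimed
    assert!(arena.alloc(40).is_some());
    println!("ok");
}
```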

@japaric
Collaborator

japaric commented Jan 27, 2020

I'd like to check a few things:

  1. are you OK with a heap / global allocator? Some bare metal applications shy
    away from using a heap because: (a) it adds another point of failure (OOM),
    (b) it adds performance overhead, (c) it introduces the problem of memory
    fragmentation and (d) heaps are not real-time friendly.

Those are the "commonly given reasons" for not wanting a heap, but to be fair:

(a) other methods of dynamic memory management like a fixed capacity memory pool
can also run out of memory. However, abstractions like heapless::Pool will let
you handle potential OOM locally (Result-based OOM handling rather than
panic-style OOM handling)
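Result-based OOM handling can be sketched with a fixed-capacity container. The type below is a standalone illustration in the spirit of heapless (not heapless's actual API): on exhaustion the value is handed back to the caller instead of panicking.

```rust
// A fixed-capacity vector that reports OOM to the caller via Result.
struct FixedVec<T, const N: usize> {
    buf: [Option<T>; N],
    len: usize,
}

impl<T, const N: usize> FixedVec<T, N> {
    fn new() -> Self {
        FixedVec { buf: std::array::from_fn(|_| None), len: 0 }
    }

    // On exhaustion, give the value back so the caller can decide what to do.
    fn push(&mut self, v: T) -> Result<(), T> {
        if self.len == N {
            return Err(v); // local, recoverable OOM
        }
        self.buf[self.len] = Some(v);
        self.len += 1;
        Ok(())
    }
}

fn main() {
    let mut v: FixedVec<u8, 2> = FixedVec::new();
    assert!(v.push(1).is_ok());
    assert!(v.push(2).is_ok());
    assert_eq!(v.push(3), Err(3)); // handled locally, no panic
    println!("ok");
}
```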

(b) other methods of dynamic memory management also incur performance overhead.
It's probably easier to reason about the overhead of a memory pool than the
overhead of a general purpose allocator, though.

(c) A memory pool hands out fixed-size memory blocks that sit next to each other
in memory so there's no external fragmentation. But the size of the block will
usually exceed what's needed by the application so you end up with internal
fragmentation.

(d) there are heap allocators with constant-time alloc and dealloc
operations out there (see TLSF). However, realloc is, in general, never
constant-time because it may involve a memcpy of the old allocation into the
newly allocated block. IOW, even with the right allocator only fixed-capacity
collections like Box<[T]> (and not Vec, HashMap, etc.) are real-time
friendly.

So, not using a heap allocator does not remove these 4 issues. As long as you
use any other form of dynamic memory allocation (memory pools, arenas, etc.)
you'll still have them.

I think a more relevant question here would be if you think a general-purpose
allocator could be a potential attack vector from a security point of view. If
we let the heap allocator manage a fixed amount of memory (i.e. brk & sbrk
are not allowed) then (malicious) OOM should not result in memory corruption
(unless the implementation of the allocator is unsound).

  2. prost and arena allocation

The bumpalo API returns references for each allocation. RTFM message passing
imposes a : 'static bound on all messages: this is to avoid sending stack
allocated values, because the sending task may be finished, and its stack frame
deallocated, before the receiving task is able to dispatch the message. So
using bumpalo would mean not being able to send prost-parsed packets between
tasks, which we probably want to do.

I have not looked at what kind of code prost can generate but if we can make it
generate structures with self-referential references (i.e. pointers into another
field of the struct) then those structs would satisfy the : 'static bound and
it would be possible to send them between tasks.

I'm thinking of structs like this:

```rust
// `'self` is made-up syntax that means self-referential
struct Packet {
    // a chunk of memory
    blob: alloc::Box<[u8]>,

    // fields specified in the .proto file
    id: u32,
    some_bytes: &'self mut [u8],
    // map allocated in contiguous memory (i.e. not in a tree)
    some_map: IndexMap<'self, u8, u8>,
}
```

All fields that would normally be heap-allocated are allocated in the blob field.
So in a sense, blob is the arena and fields like some_bytes and some_map
are arena-allocated in that blob. When Packet is dropped, only blob is
deallocated and no destructors are run on arena-allocated fields.
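One way to get the : 'static-friendly effect without made-up 'self syntax is to store offsets into blob instead of borrows. A sketch (field names follow the made-up example above, not real prost-generated code):

```rust
// Store byte ranges (indices into `blob`) rather than references, so the
// struct has no lifetime parameter and satisfies the `: 'static` bound.
use core::ops::Range;

struct Packet {
    blob: Box<[u8]>,          // the per-packet arena: one allocation
    id: u32,
    some_bytes: Range<usize>, // indices into `blob`, not a borrow
}

impl Packet {
    // Resolve the stored range into a slice on demand.
    fn some_bytes(&self) -> &[u8] {
        &self.blob[self.some_bytes.clone()]
    }
}

fn main() {
    let p = Packet {
        blob: vec![0xde, 0xad, 0xbe, 0xef].into_boxed_slice(),
        id: 1,
        some_bytes: 1..3,
    };
    assert_eq!(p.some_bytes(), &[0xad, 0xbe]);
    assert_eq!(p.id, 1);
    println!("ok");
}
```

Dropping such a Packet deallocates only blob, matching the "no destructors run on arena-allocated fields" behavior described above.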

blobs would need to be managed by some general-purpose allocator (e.g.
#[global_allocator]) that deals with external fragmentation -- unless blobs
always have the same size because all packets have the same structure and
size, in which case a fixed-size memory pool could be used instead of this
memory management scheme.

In any case, I think we should start with plain alloc::{Vec,HashMap}s plus a
#[global_allocator], for flexibility, and then optimize memory management
once we have a better understanding of what the final layout of the prost-parsed
packets will be.
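For reference, wiring up a #[global_allocator] looks like this. The sketch wraps std's System allocator just to show the mechanism (and counts bytes as a bonus); on the embedded target it would instead be a fixed-region allocator over a static buffer:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// A trivial GlobalAlloc wrapper that tracks how many bytes are live.
struct Counting;

static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every Box/Vec/BTreeMap in the program now goes through `Counting`.
#[global_allocator]
static GLOBAL: Counting = Counting;

fn main() {
    let v = vec![0u8; 1024];
    assert!(ALLOCATED.load(Ordering::Relaxed) >= 1024);
    drop(v);
    println!("ok");
}
```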

@tarcieri
Contributor Author

> In any case, I think we should start with plain alloc::{Vec,HashMap}s plus a
> #[global_allocator], for flexibility, and then optimize memory management
> once we have a better understanding of what the final layout of the prost-parsed
> packets will be.

This sounds good to me (however I believe HashMap isn't in alloc because its default hasher requires std, but that's fine because we prefer deterministically ordered BTreeMaps anyway).
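A quick illustration of the determinism point: BTreeMap lives in alloc::collections (so it works without std), and iteration order is always sorted by key, regardless of insertion order:

```rust
// `alloc::collections::BTreeMap` in a no_std + alloc build; via std here.
use std::collections::BTreeMap;

fn main() {
    let mut m = BTreeMap::new();
    m.insert(3u8, "c");
    m.insert(1u8, "a");
    m.insert(2u8, "b");

    // Keys come back sorted, not in insertion (or hash) order.
    let keys: Vec<u8> = m.keys().copied().collect();
    assert_eq!(keys, vec![1, 2, 3]);
    println!("ok");
}
```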

Longer term: I think all of the data structures we'd want to use for Protobuf "APDUs" could be represented with heapless types. The real problem right now is that Prost requires std; it has a path to no_std + alloc (https://github.com/danburkert/prost/pull/215) and may potentially be able to work with e.g. heapless if vtables can be provided for bytes (tokio-rs/bytes#298).

As for our longer term needs for alloc: unclear.

@tarcieri
Contributor Author

There's one big reason why Protocol Buffers may be a bad fit for our needs: the schema language has no way to express length information about "bytes", e.g. fixed-size bytestrings or bytestrings with a maximum length. That makes it difficult to describe something like an "APDU" that maps to a fixed-size type such as heapless::Vec<u8, N> (where N: ArrayLength<u8>).
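Lacking a schema-level bound, the length check has to happen at decode time instead. A sketch (the constant, function name, and error type are illustrative, not part of any real codec):

```rust
// Since the .proto schema can't bound the length of a `bytes` field, the
// bound is enforced when the raw field is pulled off the wire.
const MAX_APDU_BYTES: usize = 255; // hypothetical APDU payload limit

fn check_bytes_field(raw: &[u8]) -> Result<&[u8], &'static str> {
    if raw.len() > MAX_APDU_BYTES {
        Err("bytes field exceeds APDU limit")
    } else {
        Ok(raw)
    }
}

fn main() {
    assert!(check_bytes_field(&[0u8; 16]).is_ok());
    assert!(check_bytes_field(&[0u8; 300]).is_err()); // would overflow the APDU
    println!("ok");
}
```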

In the past I've gone down the road of making bytes and prost compatible with no_std + alloc (see the now obsolete microcrates/bytes and microcrates/prost). If we want to try to get prost working I'd definitely prefer upstreaming the changes as opposed to working with a fork (see also prost-amino).

Unfortunately, while bytes now has upstream support for alloc, the vtable isn't otherwise exposed and this seems like a major shortcoming for adding true heapless support.


An alternative is to use a similar schema-driven TLV format, but one designed to work out-of-the-box in heapless environments. I have been toying around with such a format.

I think it'd be ok to use a format like this purely for serialization (e.g. having domain types separate from the encoding objects), with From conversions to convert between the domain and encoding types. That would potentially allow us to use one encoding for APDUs to be decoded/encoded on the USB armory, and regular Protobufs for e.g. a gRPC service running on a (e.g. Linux) host the USB armory is plugged into.
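The domain-type / encoding-type split with From conversions might look like this (the types and fields are illustrative, not from any real schema):

```rust
// Domain type: what application code works with.
struct TemperatureReading {
    celsius: i32,
}

// Encoding type: mirrors the wire schema (e.g. a generated message struct).
#[derive(Debug, PartialEq)]
struct TemperatureMsg {
    celsius_millis: i64,
}

// From conversions bridge the two, so the same domain type can back
// different encodings (APDU format on-device, Protobuf on the host).
impl From<TemperatureReading> for TemperatureMsg {
    fn from(r: TemperatureReading) -> Self {
        TemperatureMsg { celsius_millis: r.celsius as i64 * 1000 }
    }
}

impl From<TemperatureMsg> for TemperatureReading {
    fn from(m: TemperatureMsg) -> Self {
        TemperatureReading { celsius: (m.celsius_millis / 1000) as i32 }
    }
}

fn main() {
    let msg: TemperatureMsg = TemperatureReading { celsius: 21 }.into();
    assert_eq!(msg, TemperatureMsg { celsius_millis: 21_000 });
    let back: TemperatureReading = msg.into();
    assert_eq!(back.celsius, 21);
    println!("ok");
}
```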

@tarcieri
Contributor Author

tarcieri commented Mar 5, 2020

While there's been some upstream movement on prost + alloc, we've managed to make progress using Veriform which is able to use heapless types and therefore has no alloc dependency. So closing this.
