docs/coding-guidelines: Add document #2780
# Coding Guidelines

<!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc -->
**Table of Contents**

- [Coding Guidelines](#coding-guidelines)
    - [Hierarchical State Machines](#hierarchical-state-machines)
        - [Conventions for `poll` implementations](#conventions-for-poll-implementations)
        - [Prioritize local work over new work from a remote](#prioritize-local-work-over-new-work-from-a-remote)
    - [Bound everything](#bound-everything)
        - [Channels](#channels)
        - [Local queues](#local-queues)
        - [Further reading](#further-reading)
    - [No premature optimizations](#no-premature-optimizations)
    - [Keep things sequential unless proven to be slow](#keep-things-sequential-unless-proven-to-be-slow)
    - [Use `async/await` for sequential execution only](#use-asyncawait-for-sequential-execution-only)
    - [Don't communicate by sharing memory; share memory by communicating.](#dont-communicate-by-sharing-memory-share-memory-by-communicating)
        - [Further Reading](#further-reading)

<!-- markdown-toc end -->

Below is a set of coding guidelines followed across the rust-libp2p code base.

## Hierarchical State Machines
If you squint, rust-libp2p is just a big hierarchy of [state
machines](https://en.wikipedia.org/wiki/Finite-state_machine) where parents pass
events down to their children and children pass events up to their parents.
<details>
<summary>Code</summary>

```
@startuml
Swarm <|-- RootBehaviour
Swarm <|-- ConnectionPool
Swarm <|-- Transport
RootBehaviour <|-- PingBehaviour
RootBehaviour <|-- IdentifyBehaviour
RootBehaviour <|-- KademliaBehaviour

Swarm : poll()
RootBehaviour : poll()
ConnectionPool : poll()
Transport : poll()
PingBehaviour : poll()
IdentifyBehaviour : poll()
KademliaBehaviour : poll()
@enduml
```
</details>
```
                    ,------.
                    |Swarm |
                    |------|
                    |poll()|
                    `------'
                        |
                        |
 ,-------------.  ,--------------.  ,---------.
 |RootBehaviour|  |ConnectionPool|  |Transport|
 |-------------|  |--------------|  |---------|
 |poll()       |  |poll()        |  |poll()   |
 `-------------'  `--------------'  `---------'
        |
 ,-------------.  ,-----------------.  ,-----------------.
 |PingBehaviour|  |IdentifyBehaviour|  |KademliaBehaviour|
 |-------------|  |-----------------|  |-----------------|
 |poll()       |  |poll()           |  |poll()           |
 `-------------'  `-----------------'  `-----------------'
```
Using hierarchical state machines is a deliberate choice throughout the
rust-libp2p code base. It makes reasoning about control and data flow easy and
works well with Rust's `Future` model. The architecture pattern of
hierarchical state machines should be used wherever possible.
### Conventions for `poll` implementations

The `poll` method of a single state machine can be complex, especially when
that state machine itself `poll`s many child state machines. The patterns
shown below have proven useful and should be followed across the code base.
``` rust
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
    loop {
        match self.child_1.poll(cx) {
            // The child made progress.
            Poll::Ready(_) => {
                // Either return an event to the parent:
                return Poll::Ready(todo!());
                // or `continue`, thus polling `child_1` again. `child_1` can
                // potentially make more progress:
                continue
                // but NEVER move on to the next child if the current child
                // made progress. Given that the current child might be able
                // to make more progress, it did not yet register the waker in
                // order for the root task to be woken up later on. Moving on
                // to the next child might result in the larger `Future` task
                // stalling, as it assumes that there is no more progress to
                // be made.
            }

            // The child did not make progress. It has registered the waker
            // for a later wake-up. Proceed with the other children.
            Poll::Pending => {}
        }

        match self.child_2.poll(cx) {
            // As above.
        }

        match self.child_3.poll(cx) {
            // As above.
        }

        // ...

        // None of the child state machines can make any more progress. Each
        // registered the waker in order for the root `Future` task to be
        // woken up again.
        return Poll::Pending
    }
}
```
### Prioritize local work over new work from a remote

When handling multiple work streams, prioritize local work items over
accepting new work items from a remote. Take the following state machine as
an example, reading and writing from/to a socket and returning results to its
parent:
``` rust
struct SomeStateMachine {
    socket: Socket,
    events_to_return_to_parent: VecDeque<Event>,
    messages_to_send_on_socket: VecDeque<Message>,
}

impl Stream for SomeStateMachine {
    type Item = Event;

    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        // First priority is returning local finished work.
        if let Some(event) = self.events_to_return_to_parent.pop_front() {
            return Poll::Ready(Some(event));
        }

        // Second priority is finishing local work, i.e. sending on the socket.
        if let Poll::Ready(()) = self.socket.poll_ready(cx) {
            todo!("Send messages")
        }

        // Last priority is accepting new work, i.e. reading from the socket.
        if let Poll::Ready(Some(work_item)) = self.socket.poll_next(cx) {
            todo!("Start work on new work item")
        }

        Poll::Pending
    }
}
```

This prioritization provides:
- Low memory footprint, as local queues (here `events_to_return_to_parent`) stay small.
- Low latency, as accepted local work is not stuck in queues.
- DoS defense, as a remote neither controls the size of the local queue nor starves local work with its remote work.
## Bound everything

The concept of unboundedness is an illusion. Use bounded mechanisms to
prevent unbounded memory growth and high latencies.
### Channels

When using channels (e.g. `futures::channel::mpsc` or `std::sync::mpsc`),
always use the bounded variant, never the unbounded variant. When using a
bounded channel, a slow consumer eventually slows down a fast producer once
the channel bound is reached, ideally granting the slow consumer more system
resources (e.g. CPU time) and keeping queues small and thus latencies low.
When using an unbounded channel, a fast producer continues being a fast
producer, growing the channel buffer indefinitely and increasing latency
until the illusion of unboundedness breaks and the system runs out of memory.

One may use an unbounded channel if one enforces backpressure through an
out-of-band mechanism, e.g. the consumer granting the producer send-tokens
through a side-channel.
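As a minimal sketch of this guideline using only the standard library, `std::sync::mpsc::sync_channel` gives a bounded channel whose `send` blocks once the buffer is full, propagating backpressure from the consumer back to the producer (the capacity of 2 is chosen arbitrarily for illustration):

``` rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // A bounded channel with capacity 2: `send` blocks once the buffer is
    // full, so a slow consumer slows down a fast producer.
    let (tx, rx) = mpsc::sync_channel::<u32>(2);

    let producer = thread::spawn(move || {
        for i in 0..10 {
            // Blocks whenever the consumer lags more than 2 messages behind,
            // keeping the queue (and thus latency) bounded.
            tx.send(i).expect("consumer alive");
        }
        // Dropping `tx` here closes the channel, ending `rx.iter()` below.
    });

    let received: Vec<u32> = rx.iter().collect();
    producer.join().unwrap();
    assert_eq!(received, (0..10).collect::<Vec<_>>());
}
```

With `futures::channel::mpsc::channel(n)`, the same backpressure shows up as the `send` future staying pending instead of the thread blocking.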
### Local queues

What applies to channels shared across potentially concurrent actors (e.g.
futures tasks or OS threads) applies equally to queues owned by a single
actor only. E.g. reading events from a socket into a `Vec<Event>` without
some mechanism bounding the size of that `Vec<Event>` can likewise lead to
unbounded memory growth and high latencies.

Note that rust-libp2p still fails at this guideline in many places, i.e. it
still has many unbounded local queues.
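One way to bound such a local queue is sketched below. The `EventBuffer` type, the `MAX_QUEUE` constant, and the `try_enqueue` helper are hypothetical, chosen for illustration only: rejecting an event once the bound is hit forces the caller to stop reading from the socket until the queue drains, i.e. to exert backpressure on the remote.

``` rust
use std::collections::VecDeque;

// Hypothetical bound, chosen for illustration only.
const MAX_QUEUE: usize = 16;

struct EventBuffer {
    events: VecDeque<u32>,
}

impl EventBuffer {
    /// Accepts a new event only while the queue is below its bound.
    ///
    /// On `Err`, the caller must stop reading from the socket until the
    /// queue has drained, thereby exerting backpressure on the remote.
    fn try_enqueue(&mut self, event: u32) -> Result<(), u32> {
        if self.events.len() >= MAX_QUEUE {
            return Err(event);
        }
        self.events.push_back(event);
        Ok(())
    }
}

fn main() {
    let mut buffer = EventBuffer { events: VecDeque::new() };
    for i in 0..MAX_QUEUE as u32 {
        assert!(buffer.try_enqueue(i).is_ok());
    }
    // The next event is rejected: the queue stays bounded.
    assert_eq!(buffer.try_enqueue(99), Err(99));
}
```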
### Further reading

- https://en.wikipedia.org/wiki/Bufferbloat
- https://apenwarr.ca/log/20170814
- https://twitter.com/peterbourgon/status/1212800031406739456
## No premature optimizations

Optimizations that add complexity need to be accompanied by a proof of their
effectiveness.

This also applies to increasing buffer or channel sizes, as the downside of
such pseudo-optimizations is increased memory footprint and latency.
## Keep things sequential unless proven to be slow

Concurrency adds complexity, and it adds overhead due to synchronization.
Thus, unless proven to be a bottleneck, don't make things concurrent. As an
example, the hierarchical `NetworkBehaviour` state machine runs sequentially.
Because it runs sequentially, it is easy to debug, and thus far there has
been no proof that running it concurrently would yield a speed-up.
## Use `async/await` for sequential execution only

Using `async/await` does not allow accessing methods on the object being
`await`ed unless paired with some synchronization mechanism like an
`Arc<Mutex<_>>`.

Using `async/await` for sequential execution makes things significantly
simpler.

Example: Read from a socket and, once done, write to it. Use `async/await`.
``` rust
socket.read_exact(&mut read_buf).await;
socket.write(&write_buf).await;
```
Example: Read from and concurrently write to a socket. Use `poll`.
``` rust
loop {
    match socket.poll_read(cx, &mut read_buf) {
        Poll::Ready(_) => {
            todo!();
            continue;
        }
        Poll::Pending => {}
    }
    match socket.poll_write(cx, &write_buf) {
        Poll::Ready(_) => {
            todo!();
            continue;
        }
        Poll::Pending => {}
    }

    return Poll::Pending;
}
```
When providing `async` methods, make it explicit whether it is safe to cancel
the resulting `Future`, i.e. whether it is safe to drop the `Future` returned
by the `async` method.
## Don't communicate by sharing memory; share memory by communicating.

The majority of rust-libp2p's code base follows the above Golang philosophy,
e.g. using channels instead of mutexes. This pattern enforces single
ownership over data, which works well with Rust's ownership model and makes
reasoning about data flow easier.
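As a minimal standard-library sketch of the pattern (the `Command` enum and counter are made up for illustration): instead of sharing a counter behind an `Arc<Mutex<u64>>`, a single thread owns the state and everyone else communicates with it through a channel.

``` rust
use std::sync::mpsc;
use std::thread;

// Messages sent to the single owner of the state.
enum Command {
    Increment,
    Get(mpsc::Sender<u64>),
}

fn main() {
    let (tx, rx) = mpsc::channel::<Command>();

    // The owner thread: the only place that touches `counter`.
    let owner = thread::spawn(move || {
        let mut counter = 0u64;
        for cmd in rx {
            match cmd {
                Command::Increment => counter += 1,
                Command::Get(reply) => reply.send(counter).unwrap(),
            }
        }
    });

    for _ in 0..3 {
        tx.send(Command::Increment).unwrap();
    }
    // Query the state via a reply channel instead of locking shared memory.
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(Command::Get(reply_tx)).unwrap();
    assert_eq!(reply_rx.recv().unwrap(), 3);

    drop(tx); // closing the channel lets the owner thread exit
    owner.join().unwrap();
}
```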
### Further Reading

- https://go.dev/blog/codelab-share