Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move Builders to by-value #191

Closed
seanmonstar opened this issue Mar 19, 2018 · 9 comments
Closed

Move Builders to by-value #191

seanmonstar opened this issue Mar 19, 2018 · 9 comments
Labels
A-request Area: Request A-response Area: Response A-uri Area: Uri and parts B-breaking Blocked: breaking change. This requires a breaking change. S-feature Severity: feature. This adds something new.
Milestone

Comments

@seanmonstar
Copy link
Member

I propose that http::request::Builder and http::response::Builder should be changed to be by-value builders, instead of by-ref as they currently are. This may be a little controversial, but I believe there's good reason to do so, and my proposal still includes ways to mutate builders with &mut self for cases that want it.

Motivation

  • Ergonomics:
    • Using by-ref builders is easy when you can write all method calls in a single chain, but as soon as the chain must be broken, it must be broken into at least 3 pieces: let mut b = Request::builder(), b.chain().chain();, and then b.chainMore().build().
    • Creating a builder, and setting initial properties, before passing it as a return value or to a function requires you to break up the chain as mentioned in the previous point, since you must get the original value from builder() to be able to pass it anywhere by value.
  • Correctness:
    • Using by-ref builders means that calling build leaves behind a mine in the builder slot, since the builder must consume internal state to build. That means the compiler will not catch any attempts to use the builder after a call to build.
    • Passing &mut Builder to a function can actually consume the internal builder state, which can result in unexpected panics after having passed it to a modifier where the assumption was it was just set some properties.

Improvements

  • Ergonomics:
    • With a by-value builder, it is easy to stop the chain at any point and pass the Builder by value to another function or return it. It is not required to break up the chain to appease the type system.
  • Correctness:
    • If a user has a Builder, they know for certain that it is not a consumed shell that might panic if used more. Using the builder to build requires ownership of it, so supposed modifier functions cannot accidentally consume it.

But conditionally chaining on by-value is annoying

First, I'd offer that if we have to pick between two annoying things, dealing with a by-value builder is the less annoying of the two. Compare:

// by-ref
let mut builder = Request::builder();
builder
    .uri("/foo/bar")
    .method("POST");

if body_len != 0 {
    builder.header("content-length", body.to_string());
}

builder
    .body(body)
    .unwrap()

VS

// by-value
let mut builder = Request::builder()
    .uri("/foo/bar")
    .method("POST");

if body_len != 0 {
    builder = builder.header("content-length", body.to_string());
}

builder
    .body(body)
    .unwrap()

However, as I said in the beginning, we can add methods that mutate on &mut self, to allow that pattern for those who want it. Basically, every builder method that mutates self by-value, like uri, we could add set_uri(&mut self, ...) -> &mut Self as a mirror.

These by-ref mutators could be chained as well. The only pain is that they couldn't be chained into body(), since that would require by-value. However, I've provided above while I feel this is better.

Alternative

We could optionally have the by-ref mutates keep the same name, and make the by-value methods have a with_ prefix. However, this would end up as a bigger breaking change, since all existing code could no longer call body() on their chains.

Breaking Changes

If we switch to by-value without the prefix, the breaking changes would be these:

  • Using methods when receiving a &mut Builder would need to be changed to the set_ versions.
  • Anyone (ab)using being able to construct a the item from a &mut Builder would no longer be able to do so.

However, anyone that has the full chain together should not see any breaking change.

@seanmonstar seanmonstar added this to the 0.2 milestone Mar 19, 2018
@carllerche
Copy link
Collaborator

Personally, I don't have a strong opinion between &mut self and self builder styles. However, I personally prefer &mut self.

I disagree with the ergonomics points. While it requires less lines of code, as you pointed out, being forced to assign the new builder back to the original variable always felt weird to me. Also, when writing fns that mutate the builder, one now always has to include the builder as a return value.

That said, I personally don't care enough about the differences between the two styles. What I do care about is:

  • We should stick with idiomatic Rust unless there is a good reason not to.
  • We should have one way of doing things unless there is a good reason not to.

The Rust API guidelines recommend &mut self. I think this is the summary as to why.

I also don't believe that stylistic preference is enough of an argument to support providing two separate idioms.

@sagebind
Copy link
Contributor

I find the guidelines on builders a bit too strict. When I write builders, I tend to follow this mental guideline:

  • If the builder can be re-used after building one instance (basically a template): &mut self builder. Example: std::thread::Builder would make more sense to take &mut self instead of self.
  • If the builder has some special state that cannot be re-used after building: self builder. Example: I have a game engine that uses this for building an engine instance, which includes various system resources in the builder. I think the request and response builders as-is make sense with this approach, since the body is non-reusable.

I don't see a big problem with using different approaches depending on the situation, though generally I agree we should weigh very carefully whether the benefits of alternate approach outweighs breaking conventions.

@seanmonstar
Copy link
Member Author

I disagree with the ergonomics points.

While I actually think the correctness points are more important, I think the ergonomic gains are a net win, since we can actually remove the weirdness.

being forced to assign the new builder back to the original variable always felt weird to me

I think we can fix this weirdness by include by-ref set_* methods. For example:

// by-value
let mut builder = Request::builder()
    .uri("/foo/bar")
    .method("POST");

if body_len != 0 {
    builder.set_header("content-length", body.to_string());
}

builder
    .body(body)
    .unwrap()

when writing fns that mutate the builder, one now always has to include the builder as a return value.

Another reason we should add by-ref set_* methods.

The Rust API guidelines recommend &mut self. I think this is the summary as to why.

As I pointed out in a comment in the same issue, I'm convinced the guidelines are wrong. Additionally, the guidelines aren't cared for frequently, and so tend to just kind of whither.

In the case where builders can be reused, I think build(&self) is the right choice. Things like the handshake builders in h2, since you can build up some settings and then use it to handshake several connections, should use a by-ref builder.

In the case of http's request and response builders, this is not the case. They cannot be reused, and you have no way of even protecting against a library mutator that takes &mut Builder from consuming the item out from under you. You'd need to wrap calls with catch_unwind, since it panics unconditionally.

@carllerche
Copy link
Collaborator

Given the size of Request, is switching to a by-value builder guaranteed to maintain similar asm? There definitely could be a negative perf impact if the generated code performed a ton of moves (that said, I would be surprised if it did...).

@seanmonstar seanmonstar added S-feature Severity: feature. This adds something new. A-request Area: Request A-response Area: Response B-breaking Blocked: breaking change. This requires a breaking change. labels Oct 18, 2018
@seanmonstar
Copy link
Member Author

The uri::Builder from #219, which followed the by-ref style to consistent, should be included in this change for 0.2.

@seanmonstar seanmonstar added the A-uri Area: Uri and parts label Oct 18, 2018
@carllerche
Copy link
Collaborator

One advantage with moving the core types to a new crate, we could bump the builders sooner.

@Ivshti
Copy link

Ivshti commented Jun 10, 2019

@seanmonstar wouldn't a by-value builder also eliminate the need for .unwrap(), as the type system ensures that build can only be called once?

I find this to be a big ergonomics issue in the current builder

@JeanMertz
Copy link

JeanMertz commented Nov 4, 2019

usage report: I ran into an error in my production application which returned the cannot reuse response builder error.

Luckily, using backtraces it was relatively easy to find out the cause was a missing return in a long if/else chain of different server responses being constructed from a pre-defined builder.

But yes, I can concur that it would have been nice if this was detected at compile time by making the builder consume itself.

edit ah wait, I see this was merged in #332, looking forward to an eventual 0.2 release, thanks for this change 👍

@seanmonstar
Copy link
Member Author

This has been done in 0.2.x.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-request Area: Request A-response Area: Response A-uri Area: Uri and parts B-breaking Blocked: breaking change. This requires a breaking change. S-feature Severity: feature. This adds something new.
Projects
None yet
Development

No branches or pull requests

5 participants