Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow specifying arbitrary attributes on the builder struct #237

Merged
merged 15 commits into from Mar 15, 2022

Conversation

ijackson
Copy link
Contributor

@ijackson ijackson commented Mar 7, 2022

In arti, we would like to be able to make the builder Deserialize. That means that we need to specify our custom serde attributes. There may be other use cases.

@TedDriggs
Copy link
Collaborator

I am very conflicted about this PR.

I agree that this is powerful, and that #221 - and probably others - would be able to benefit from it. At the same time, I have two specific fears:

  1. Only attributes that parse to syn::Meta will work here - any other attributes will break the entire crate.
  2. Does this push us too much into the realm of macros obscuring intent? I don't envy the person who has to try and debug a serde impl they can't see.

Can you elaborate a bit on why you need to deserialize into a builder? That might help me get over fear 2.

@ijackson
Copy link
Contributor Author

ijackson commented Mar 9, 2022

Hi.

Thanks for taking a look. I agree that this is powerful and I can see why you want to be careful. I guess my primary response is that I think whether this obscures things too much, or is useful, is a tradeoff that ought to be considered primarily by the project who choose to use this feature (or, indeed, choose not to use it).

Can you elaborate a bit on why you need to deserialize into a builder? That might help me get over fear 2

Of course.

We want this in Arti for the configuration. Making the builder Deserialize will allow us to provide isomorphic configuration settings via a configuration file reader, as we do via builders and setter methods.

Currently we attempt to do this by having the built struct be Deserialize. But this has a number of problems. All the defaults, field renaming, and so on, have to be specified twice. We would also like the configuration (as read from a file) to be Serialize as well, but there are serious problems with trying to round-trip: we would have to undo any compatibility fixups, which (in the general case) is not possible.

Only attributes that parse to syn::Meta will work here - any other attributes will break the entire crate.

I think you are right and this is wrong. It ought to be capable of handling anything that fits into the content of a syn::Attribute. Should I change this so that one writes #[builder(attrs( #[serde(whatever)] )) ?

@TedDriggs
Copy link
Collaborator

TedDriggs commented Mar 9, 2022

I think you are right and this is wrong. It ought to be capable of handling anything that fits into the content of a syn::Attribute. Should I change this so that one writes #[builder(attrs( #[serde(whatever)] ))?

Barring a major change in syn, that won't work.

When darling encounters one of the attributes its been told to parse via #[darling(attributes(XXX))], the first thing it does is call syn::Attribute::parse_meta on that individual attribute. Including the #[ before serde isn't valid in a Meta, so darling would note an error and skip the whole attribute.

That leaves us with three options:

  1. Limit this to accepting Meta-compatible syntax. This would unblock the serde use-case, but it would create failure cases that would feel arbitrary and would have unhelpful error messages.
  2. Make the syntax #[builder(attr = "#[serde(...)]", attr = "#[other_thing(...)]")]. I think this is possible with #[darling(multiple)]. However, it's going to effectively mandate use of raw strings for non-trivial attributes, which is going to get ugly fast.
  3. Have pass-through attributes in a separate top-level attribute, e.g. #[builder_field_attr(serde(...))]. The proc-macro would then claim builder and builder_field_attr, darling_opts::Field would add builder_field_attr to its list of forward_attrs, and the contents of #[builder_field_attr(...)] would then be turned into an attribute on the field.

Of those three, I'm inclined to pursue the third one; it works with any attribute, doesn't require raw strings, and should (I think) be able to preserve span information better for the macro that receives the attributes.

Here's a skeleton of what that would look like:

/// Data extracted from the fields of the input struct.
#[derive(Debug, Clone, FromField)]
#[darling(
    attributes(builder),
    forward_attrs(doc, cfg, allow, builder_field_attr),
    and_then = "Self::unnest_field_attrs"
)]
pub struct Field {
    // ... fields elided ...
}

impl Field {
    /// Remove the `builder_field_attr(...)` packaging around attributes meant for fields
    /// in the builder.
    fn unnest_field_attrs(mut self) -> darling::Result<Self> {
        let mut errors = vec![];

        for attr in self
            .attrs
            .iter_mut()
            .filter(|attr| attr.path.is_ident("builder_field_attr"))
        {
            match unnest_attribute(attr) {
                Ok(new_attr) => *attr = new_attr,
                Err(e) => errors.push(e),
            }
        }

        if !errors.is_empty() {
            return Err(darling::Error::multiple(errors));
        }

        Ok(self)
    }
}

fn unnest_attribute(_attr: &syn::Attribute) -> darling::Result<Attribute> {
    todo!()
}

I took a brief stab at pulling the attribute apart and reassembling it, but couldn't figure out how to do so without a panic if the contents of #[builder_field_attr(...)] were bad.


As an aside, I've called this builder_field_attr in option 3 because #221 is looking for #[builder_setter_attr(...)]. That's not in scope for this PR, but I'd like to avoid creating naming ambiguity, and this way the reader will be able to understand what's going on without checking the derive_builder docs to see what builder_attr means.

@ijackson
Copy link
Contributor Author

ijackson commented Mar 9, 2022

@TedDriggs Thanks for the further review and the very helpful comments. I agree that 3 is the best approach. I will rework this MR accordingly.

@ijackson
Copy link
Contributor Author

I tried the darling(and_then...) but the attributes need to be processed separately for fields, compared to setter methods, etc. I want to avoid making a copy of the attribute list when this feature isn't used, so the manipulated attribute list wants to be Cow. That means putting it in Field is awkward, since it would make Field self-referential.

Having played about, I think the best way to do this is to pass an error vec to Field::as_builder_field, so that it can become "fallible" (or, strictly, have an ability to report errors), and collect the errors in builder_for_struct.

@TedDriggs
Copy link
Collaborator

TedDriggs commented Mar 10, 2022

and_then should be taking ownership of the whole Field, so you're not copying anything; you'll do an in-place mutation of the array to swap out #[builder_field_attr...] attributes for their unnested equivalents in-place.

I have a strong preference for doing all validation upfront, so that when from_derive_input returns Ok we are highly confident code generation will work.

@ijackson
Copy link
Contributor Author

and_then should be taking ownership of the whole Field, so you're not copying anything; you'll do an in-place mutation of the array to swap out #[builder_field_attr...] attributes for their unnested equivalents in-place.

Well, yes, but Field would want to contain:

  attrs: Vec<Attribute>,
  field_attrs: Cow<&[Attribute]>,

where if the nested attribute feature isn't used, field_attrs = Cow::Borrowed(attrs). But that's not possible because that would make Field a self-referential struct. This occurs in any situation where this "perhaps a separate value, but perhaps a reference to some other part of the options" is contained within a single Rust value along with the other options, which is necessarily the case if it's all just the return value from from_derive_input.

TBH, looking at the existing code, my first thought of where to add the necessary additional processing was in the construction of FieldWithDefaults. I(t could contain a Cow referencing parts of Field. But FieldWithoutDefaults is constructed late and infallibly. Making its construction fallible (or at least, giving it access to &mut errs) would solve that but, without other changes, would involve errors arising later.

The alternatives would seem to be:

  • Never do this reference trick, and instead always copy things even when perhaps it's not needed.

  • Use Rc.

  • Do this later in the construction of FieldWithDefaults or even later in one of its methods.

  • Something like

    let opts = macro_options::RawOptions::from_derive_input(&ast)?; // from darling
    let opts: DefaultedAndProcessedOptions<'_> = opts.resolve()?; // can reference RawOptions
    // presumably `DefaultedAndProcessedOptions` contains `Vec<FieldWithDefaults<'a>>`.

Given what you've just said, I think possibly DefaultedAndProcessedOptions would be the right way to go about it? That's a bit similar to what you have already done with FieldWithDefaults.

Anyway, I'm happy to do it whatever way you prefer so please let me know. Thanks for your attention.

@ijackson
Copy link
Contributor Author

Or ossibly

   let opts = macro_options::Options::from_derive_input(&ast)?; // from darling
   let fields: Vec<FieldWithDefaults<'_>> = opts.resolve_fields()?; // can reference `opts`

and then in the rest, use fields rather than opts.fields.

@TedDriggs
Copy link
Collaborator

Ah, I see the issue now. I think a little excess cloning is an acceptable tradeoff for keeping this change more self-contained. If it turns out to be a major efficiency hit later, there are ways we could optimize it, such as keeping a separate field that marks which attribute indices are field, setter, or both.

@ijackson
Copy link
Contributor Author

ijackson commented Mar 11, 2022

Ah, I see the issue now. I think a little excess cloning is an acceptable tradeoff for keeping this change more self-contained.

OK. I have now done this, and force pushed. (It didn't seem to make much sense to read it as an incremental change.)

While I was there I added builder_setter_attrs too.

If it turns out to be a major efficiency hit later, there are ways we could optimize it, such as keeping a separate field that marks which attribute indices are field, setter, or both.

I haven't done any profiling. I'm not sure how we would discover the marginal effect of this. Personally , having slept on it,I would have thought that the borrowing-capable argument processing (along the lines of my suggestion in #237 (comment)) could be made self-contaned enough, . That's still concentrating the argument processing right at the start. But obviously I will defer to your opinion on this question of taste and I didn't want to get this MR blocked on that. If you like, I could try that out in a separate MR for you to see how it turns out.

Copy link
Collaborator

@TedDriggs TedDriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't anticipated putting multiple attributes inside the outer container like this, but as long as rustfmt can handle it then I'm content with the approach. Thanks for digging so deeply into this.

derive_builder/tests/compile-fail/builder_setter_attrs.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@TedDriggs TedDriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per comment thread, let's switch to #[builder_field_attr(serde(...))].

I briefly considered the following alternate syntax:

#[builder_attr(field, serde(...))]

However, I think the single ident for the attr is sufficient, and I didn't like the risk of rustfmt putting field on its own line in that case.

@ijackson
Copy link
Contributor Author

OK, thanks. Will get back to you today.

Right now we just clone and drain Field::attrs.  But these values are
going to be distinct in a moment.
This is hard to demonstrate without importing some other crate whose
attributes can be used.  Choose serde.
This makes the example better and the test more thorough.
I couldn't think of a good example.
@ijackson
Copy link
Contributor Author

OK, made the syntax change. I chose to put it on as a commit on top, rather than squash it in.

I have rebased the branch onto current master.

Copy link
Collaborator

@TedDriggs TedDriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting closer - I don't think we want to allow multiple attributes in a single #[builder_field_attr(...)] attribute under this syntax. Limiting it to one makes it more like cfg_attr, and I'm skeptical it adds much verbosity - hopefully people aren't adding that many distinct proc-macros to their builders.

derive_builder/tests/compile-fail/builder_field_attrs.rs Outdated Show resolved Hide resolved
derive_builder/tests/compile-fail/builder_setter_attrs.rs Outdated Show resolved Hide resolved
derive_builder/src/lib.rs Outdated Show resolved Hide resolved
@ijackson
Copy link
Contributor Author

Getting closer - I don't think we want to allow multiple attributes in a single #[builder_field_attr(...)] attribute under this syntax. Limiting it to one makes it more like cfg_attr, and I'm skeptical it adds much verbosity - hopefully people aren't adding that many distinct proc-macros to their builders.

You are completely right. In fact, the code can only accept one attribute because it strips a ( ) and replaces that with #[ ]. But I didn't clean up all the plurals, Vec, etc. I will do that.

@ijackson
Copy link
Contributor Author

Pushed. Again, I didn't squash anything, not even fixup for the botched merge which I just spotted. Every commit ought to pass the tests still so it shouldn't break bisectability...

@ijackson ijackson requested a review from TedDriggs March 15, 2022 19:43
Copy link
Collaborator

@TedDriggs TedDriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful work.

@TedDriggs TedDriggs merged commit f0aebdd into colin-kiegel:master Mar 15, 2022
@TedDriggs
Copy link
Collaborator

Between this and the other changes you've put in, we have enough to merit a publication. Version 0.11.0 will be available once CI passes.

@ijackson
Copy link
Contributor Author

Thanks :-).

@ijackson ijackson deleted the builder-field-attrs branch March 16, 2022 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants