Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add provide_any module to core #3192

Merged
merged 7 commits into from Apr 13, 2022
Merged

Add provide_any module to core #3192

merged 7 commits into from Apr 13, 2022

Conversation

nrc
Copy link
Member

@nrc nrc commented Nov 4, 2021

Rendered


This RFC proposes adding a provide_any module to the core library. The module provides a generic API for objects to provide type-based access to data. (In contrast to the any module which provides type-driven downcasting, the proposed module integrates downcasting into data access to provide a safer and more ergonomic API).

Signed-off-by: Nick Cameron <nrc@ncameron.org>
@nrc nrc added T-libs-api Relevant to the library API team, which will review and decide on the RFC. A-error-handling Proposals relating to error handling. PG-error-handling Project Group: Error handling (https://github.com/rust-lang/project-error-handling) labels Nov 4, 2021
@pachi
Copy link

pachi commented Nov 4, 2021

Add link to Rendered version

@zkat
Copy link

zkat commented Nov 6, 2021

If it helps at all, this looks like it would be amaaaazing for miette. I believe that with this in place, miette could stop defining its own Diagnostic type and "simply" get users to provide various bits of context that it knows about. That means miette and its reporters would be able to use Eyre directly, instead of a bad fork of it :)

@joshtriplett joshtriplett added the I-libs-api-nominated Indicates that an issue has been nominated for prioritizing at the next libs-api team meeting. label Nov 10, 2021
text/0000-dyno.md Outdated Show resolved Hide resolved
text/0000-dyno.md Outdated Show resolved Hide resolved

`provide_any` could live in its own crate, rather than in libcore. However, this would not be useful for `Error`.

`provide_any` could be a module inside `any` rather than a sibling (it could then be renamed to `provide` or `provider`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Living under any rather than in a new top-level module seems like a more natural home for this to me. There's overlap between type tags and type ids, and Any and Provider that I think we could give proper treatment together under the same umbrella.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense! I'm a bit hesitant because type ids and type tags are so obviously similar but have nothing in common in their interfaces or how they are used, so I thought it might be a bit confusing to have them together

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think trying to come up with a module-level doc for a hypothetical combined any module might be a good exercise for identifying those cases where that confusion can come from?


The `TypeTag` trait is an abstraction over all type tags. It does not have any methods, only an associated type for the type which the tag represents. E.g., `<Ref<T> as TypeTag<'a>>::Type` is `&'a T`.

There is no universal type tag. A concrete type tag must be written for a 'category' of types. A few common tags are provided in `provide_any::tags`, including `Value` for any type bounded by `'static`, and `Ref` for types of the form `&'a T` where `T: 'static`. For less common types, the user must provide a tag which implements `TypeTag`; in this way the `provide_any` API generalises to all types.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is Ref not a type tag combinator, where if T has a type tag then Ref can be used as type tag for references to T (somewhat similar to the Option combinator that does seem to exist)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is convenience. I think if Ref was implemented the same way Optional is you'd have to invoke it with foo.get_by_tag::<Ref<Value<MyType>>>(). Thinking about it this doesn't seem too bad, since we already have get_context_ref that handles this case, so we may never need to use the Ref type tag directly as is, which seems like a good justification for updating it to compose with other TypeTags.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One reason is that type tags represent Sized types, so using the compositional form for Ref would mean we lose the ability to represent references to ?Sized types. You can add the ?Sized bound to the Type associated type, but that means a lot of I::Type: Sized bounds all over the place, so it seems sub-optimal.

text/0000-dyno.md Outdated Show resolved Hide resolved
text/0000-dyno.md Outdated Show resolved Hide resolved
@KodrAus
Copy link
Contributor

KodrAus commented Nov 15, 2021

I found a nice usecase for something like this in my day-to-day codebase, which is a storage engine, that I thought I'd share. Individual documents can be cached for better performance. Caching is fine-grained, so for each document we can choose a different caching strategy. We've got a trait for these cache entries that looks like this:

pub trait CacheEntry: Any + Send + Sync + Display {
    /**
    The raw on-disk payload bytes.
    */
    fn bytes_from_disk(&self) -> Option<BytesFromDisk> {
        None
    }

    /**
    The payload bytes that are yielded to callers.

    This may involve decompression or other transformations over the raw on-disk payload.
    */
    fn bytes(&self) -> Option<Bytes> {
        None
    }

    /**
    The approximate size of this entry in-memory.
    */
    fn approximate_size(&self) -> usize;
}

The combination of Any as a supertrait plus some optional getters are a pretty clear smell for Provider. The reason this type is Any at all is so different layers of the storage engine can cache differently shaped things, like a fully deserialized value for frequently accessed values. The Any trait just gives us a single extensibility point that only supports a single specialized usecase at a time, with Provider we'd simply provide another kind of thing a caller can ask for.

Comment on lines 134 to 136
pub trait TypeTag<'a>: Sized + 'static {
type Type: 'a;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub trait TypeTag<'a>: Sized + 'static {
type Type: 'a;
}
pub trait TypeTag: Sized + 'static {
type Type<'a>: 'a;
}

This could be made nicer with GATs, which should be stabilizing soon.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change doesn't seem to make things much nicer, it just moves the lifetime parameter from the type to the associated type, it still needs to be specified in most cases. I think GATs make this harder to understand, and the original is closer to my mental model of a type tag in any case (i.e., the tag includes the lifetime bound of the type it represents as part of its type, rather than the tag represents a type constructor that can be parameterised by any lifetime).

text/0000-dyno.md Outdated Show resolved Hide resolved
text/0000-dyno.md Outdated Show resolved Hide resolved
text/0000-dyno.md Outdated Show resolved Hide resolved
text/0000-dyno.md Outdated Show resolved Hide resolved
text/0000-dyno.md Outdated Show resolved Hide resolved
@m-ou-se m-ou-se removed the I-libs-api-nominated Indicates that an issue has been nominated for prioritizing at the next libs-api team meeting. label Dec 1, 2021
Signed-off-by: Nick Cameron <nrc@ncameron.org>
Signed-off-by: Nick Cameron <nrc@ncameron.org>
@nrc
Copy link
Member Author

nrc commented Dec 6, 2021

RFC updated to take into account comments - thanks for the feedback!

Signed-off-by: Nick Cameron <nrc@ncameron.org>
@nagisa
Copy link
Member

nagisa commented Dec 15, 2021

One thing that has been floating in my mind is – what's the complexity of a function that provides a huge number of different types, and can this be made to be better than O(n) over the number of provide calls (possibly with some help from the compiler)? Can this be done with the API as proposed?

Given the particularities of phf and how match codegen works, I would intuit that it is indeed possible, but implementation in dyno and the shape of the proposed API in general would suggest to me that O(n) is the best possible. Is that true, or am I missing something?

@nrc
Copy link
Member Author

nrc commented Dec 16, 2021

It seems O(n) is the best we can do with this API. However, I don't think n should ever be large, in practice it will be limited by the number of fields (I guess you could be providing a whole bunch of new data, but it seems like you should have a dedicated, named API for that). I couldn't work out a way to have O(1) perf here, though I agree it seems like it should be possible some how.

@yaahc
Copy link
Member

yaahc commented Dec 16, 2021

It seems O(n) is the best we can do with this API. However, I don't think n should ever be large, in practice it will be limited by the number of fields (I guess you could be providing a whole bunch of new data, but it seems like you should have a dedicated, named API for that). I couldn't work out a way to have O(1) perf here, though I agree it seems like it should be possible some how.

Not sure if this is satisfying, but because of the flexibility of the provider API I think you could pretty much always just provide a more performant interface that does have O(1) perf, though ofc with the possibly chained nature on something like the error trait you might still be doing an O(N) lookup of that other trait interface. Admittedly though if that perf is important that seems like a good argument to figure out what that API would look like before stabilizing the current provider API, not after.

text/0000-dyno.md Outdated Show resolved Hide resolved
Signed-off-by: Nick Cameron <nrc@ncameron.org>
```rust
pub mod provide_any {
pub trait Provider {
fn provide<'a>(&'a self, req: &mut Requisition<'a>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this unsound now (after Requisition<'a, '_> -> &mut Requisition<'a> change)?

Since Requisition is actually similar to (TypeId, Option<T>) for some erased T, this theoretically allows writing a safe_transmute by

  1. Requesting type U
  2. In provider replacing request for U with a request for T via mem::swap
  3. In provider satisfying the request with T-value
  4. Getting U from the request, but since the request was changed to a request for T, this reads T as U

I couldn't figure out how to get two requests with the same lifetime, but I'm sure there is a way/it's easy to accidentally allow something like this.

It seems like this API should either

  1. Use Pin<&mut Requisition<'a>>, like the object_provider crate
  2. Use Requisition<'a, '_> and provide a reborrow API to change the '_ and allow delegating

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so because the implementation of request_by_type_tag will use the original Requisition (for U) not the new one (for T) so the user will always get back None.

Note that the old and new types are equivalent, with the lifetimes fully explicit they are Requisition<'a, 'b> and &'b mut Requisition<'a>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't request_by_type_tag call provide with a reference to it's Requisition? I imagined it's implemented in a similar facion:

#[repr(C)]
struct Requisition<'a> {
    id: TypeId,
    ph: PhantomData<&'a ()>,
}

pub fn request_by_type_tag<'a, I: TypeTag<'a>>(provider: &'a dyn Provider) -> Option<I::Type> {
    let mut req = (TypeId::of::<I>(), None::<I::Type>); // this should be a `repr(C)` struct, not a tuple
    let req = transmute::<_, &mut Requisition<'a>>(&mut req);
    provider.provide(req);
    req.1
}

In this example provider can change the TypeId that Requisition checks, by replacing it with another Requisition with a different TypeId, but the slot stays the same.

If I understand correctly, Requisition<'a, 'b> was wrapping *mut inside and so replacing it wouldn't bring inconsistency between TypeId and the slot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what you mean, sorry.

Implementation is here: https://github.com/rust-lang/rust/pull/91970/files#diff-0752889661748b8a15a597d7156127b6fb90fdeda0627be50e07f3f785bd0f4dR798-R805

Requisition never stores the type id, it is synthesised as needed.

Its not actually possible to create a new Requisition since TagValue is private to the any module. In addition, Requisition is unsized so I think that even if you could create one, you can't assign it into the mutable reference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not actually possible to create a new Requisition since TagValue is private to the any module

If Requisition was Sized, the only thing you'd need would be &mut Requisition, since you could use mem::swap.

Implementation is here: https://github.com/rust-lang/rust/pull/91970/files#diff-0752889661748b8a15a597d7156127b6fb90fdeda0627be50e07f3f785bd0f4dR798-R805

I see. Interesting, thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Requisition was Sized, the only thing you'd need would be &mut Requisition, since you could use mem::swap

I don't understand, how would you create the new Requisition to swap in? Requisition has a TagValue field and you need to name that to create it and thus create the Requisition.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could do another request_by_type_tag, but I suppose there's no way to get the lifetimes to match up.

@nrc
Copy link
Member Author

nrc commented Mar 1, 2022

I've significantly reworked the RFC and draft implementations (https://github.com/nrc/provide-any and rust-lang/rust#91970). The main change is to remove type tags from the API. This lets us iterate on that part of the proposal without breaking changes. It also means the surface area of the RFC is much smaller. I have removed handling of mutable references for now (though they are still present in https://github.com/nrc/provide-any so you can see the changes requires) to further shrink the surface area. There are some further changes to the implementation to simplify it and provide better encapsulation of the implementation.

I think this resolves all open questions, though since the RFC is now somewhat different there may be new ones :-)

text/0000-dyno.md Outdated Show resolved Hide resolved
text/0000-dyno.md Outdated Show resolved Hide resolved
@bjorn3
Copy link
Member

bjorn3 commented Mar 1, 2022

The new version makes much more sense to me!

Signed-off-by: Nick Cameron <nrc@ncameron.org>
@yaahc
Copy link
Member

yaahc commented Mar 2, 2022

@rfcbot concern lang-feature

I agree with Mara that it feels like there's a lang feature that might help here.

@rfcbot concern in-rfc-example-for-non-error

I also feel like, if we're going to have a more generalized API with type tags to enable use cases that are not Error, the text of the RFC needs an example that doesn't involve Error, to help motivate that.

@joshtriplett are these concerns resolved now that the type tags have been turned into an implementation detail?

@joshtriplett
Copy link
Member

Yes, I think these have both been resolved, and the new version looks great!

@rfcbot resolved lang-feature
@rfcbot resolved in-rfc-example-for-non-error

@joshtriplett
Copy link
Member

@rfcbot resolved typetag

@TennyZhuang
Copy link

Is it possible to request/provide with an extra const string FIELD_NAME? (may be blocked by const_adt_params) In many cases, there may exists multiple fields with same type in some traits. Although newtype is a good pattern, rust doesn’t has a good support for it now.

@programmerjake
Copy link
Member

Is it possible to request/provide with an extra const string FIELD_NAME? (may be blocked by const_adt_params) In many cases, there may exists multiple fields with same type in some traits. Although newtype is a good pattern, rust doesn’t has a good support for it now.

imho it's better to use a struct type tag, rather than a const str as the disambiguating parameter, since that handles inadvertent typos or name clashes much better.

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Mar 16, 2022
@rfcbot
Copy link
Collaborator

rfcbot commented Mar 16, 2022

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. to-announce and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Mar 26, 2022
@rfcbot
Copy link
Collaborator

rfcbot commented Mar 26, 2022

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@lygstate
Copy link

lygstate commented Apr 6, 2022

when this can be merged?

@TimDiekmann
Copy link
Member

when this can be merged?

Please note that this is only an RFC, not an implementation.

@yaahc
Copy link
Member

yaahc commented Apr 13, 2022

Huzzah! The @rust-lang/libs-api team has decided to accept this RFC.

To track further discussion, subscribe to the tracking issue1.

Footnotes

  1. https://github.com/rust-lang/rust/issues/96024

@yaahc yaahc merged commit 266cbb9 into rust-lang:master Apr 13, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 10, 2022
Add the Provider api to core::any

This is an implementation of [RFC 3192](rust-lang/rfcs#3192) ~~(which is yet to be merged, thus why this is a draft PR)~~. It adds an API for type-driven requests and provision of data from trait objects. A primary use case is for the `Error` trait, though that is not implemented in this PR. The only major difference to the RFC is that the functionality is added to the `any` module, rather than being in a sibling `provide_any` module (as discussed in the RFC thread).

~~Still todo: improve documentation on items, including adding examples.~~

cc `@yaahc`
workingjubilee pushed a commit to tcdi/postgrestd that referenced this pull request Sep 15, 2022
Add the Provider api to core::any

This is an implementation of [RFC 3192](rust-lang/rfcs#3192) ~~(which is yet to be merged, thus why this is a draft PR)~~. It adds an API for type-driven requests and provision of data from trait objects. A primary use case is for the `Error` trait, though that is not implemented in this PR. The only major difference to the RFC is that the functionality is added to the `any` module, rather than being in a sibling `provide_any` module (as discussed in the RFC thread).

~~Still todo: improve documentation on items, including adding examples.~~

cc `@yaahc`
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 14, 2023
core/any: remove Provider trait, rename Demand to Request

This touches on two WIP features:

* `error_generic_member_access`
  * tracking issue: rust-lang#99301
  * RFC (WIP): rust-lang/rfcs#2895
* `provide_any`
  * tracking issue: rust-lang#96024
  * RFC: rust-lang/rfcs#3192

The changes in this PR are intended to address libs meeting feedback summarized by `@Amanieu` in rust-lang#96024 (comment)

The specific items this PR addresses so far are:

> We feel that the names "demand" and "request" are somewhat synonymous and would like only one of those to be used for better consistency.

I went with `Request` here since it sounds nicer, but I'm mildly concerned that at first glance it could be confused with the use of the word in networking context.

> The Provider trait should be deleted and its functionality should be merged into Error. We are happy to only provide an API that is only usable with Error. If there is demand for other uses then this can be provided through an external crate.

The net impact this PR has is that examples which previously looked like
```
    core::any::request_ref::<String>(&err).unwramp()
```

now look like
```
    (&err as &dyn core::error::Error).request_value::<String>().unwrap()
```

These are methods that based on the type hint when called return an `Option<T>` of that type. I'll admit I don't fully understand how that's done, but it involves `core::any::tags::Type` and `core::any::TaggedOption`, neither of which are exposed in the public API, to construct a `Request` which is then passed to the `Error.provide` method.

Something that I'm curious about is whether or not they are essential to the use of `Request` types (prior to this PR referred to as `Demand`) and if so does the fact that they are kept private imply that `Request`s are only meant to be constructed privately within the standard library? That's what it looks like to me.

These methods ultimately call into code that looks like:
```
/// Request a specific value by tag from the `Error`.
fn request_by_type_tag<'a, I>(err: &'a (impl Error + ?Sized)) -> Option<I::Reified>
where
    I: tags::Type<'a>,
{
    let mut tagged = core::any::TaggedOption::<'a, I>(None);
    err.provide(tagged.as_request());
    tagged.0
}
```

As far as the `Request` API is concerned, one suggestion I would like to make is that the previous example should look more like this:
```
/// Request a specific value by tag from the `Error`.
fn request_by_type_tag<'a, I>(err: &'a (impl Error + ?Sized)) -> Option<I::Reified>
where
    I: tags::Type<'a>,
{
    let tagged_request = core::any::Request<I>::new_tagged();
    err.provide(tagged_request);
    tagged.0
}
```
This makes it possible for anyone to construct a `Request` for use in their own projects without exposing an implementation detail like `TaggedOption` in the API surface.

Otherwise noteworthy is that I had to add `pub(crate)` on both `core::any::TaggedOption` and `core::any::tags` since `Request`s now need to be constructed in the `core::error` module. I considered moving `TaggedOption` into the `core::error` module but again I figured it's an implementation detail of `Request` and belongs closer to that.

At the time I am opening this PR, I have not yet looked into the following bit of feedback:

> We took a look at the generated code and found that LLVM is unable to optimize multiple .provide_* calls into a switch table because each call fetches the type id from Erased::type_id separately each time and the compiler doesn't know that these calls all return the same value. This should be fixed.

This is what I'll focus on next while waiting for feedback on the progress so far. I suspect that learning more about the type IDs will help me understand the need for `TaggedOption` a little better.
thomcc pushed a commit to tcdi/postgrestd that referenced this pull request Oct 17, 2023
core/any: remove Provider trait, rename Demand to Request

This touches on two WIP features:

* `error_generic_member_access`
  * tracking issue: rust-lang/rust#99301
  * RFC (WIP): rust-lang/rfcs#2895
* `provide_any`
  * tracking issue: rust-lang/rust#96024
  * RFC: rust-lang/rfcs#3192

The changes in this PR are intended to address libs meeting feedback summarized by `@Amanieu` in rust-lang/rust#96024 (comment)

The specific items this PR addresses so far are:

> We feel that the names "demand" and "request" are somewhat synonymous and would like only one of those to be used for better consistency.

I went with `Request` here since it sounds nicer, but I'm mildly concerned that at first glance it could be confused with the use of the word in networking context.

> The Provider trait should be deleted and its functionality should be merged into Error. We are happy to only provide an API that is only usable with Error. If there is demand for other uses then this can be provided through an external crate.

The net impact this PR has is that examples which previously looked like
```
    core::any::request_ref::<String>(&err).unwramp()
```

now look like
```
    (&err as &dyn core::error::Error).request_value::<String>().unwrap()
```

These are methods that based on the type hint when called return an `Option<T>` of that type. I'll admit I don't fully understand how that's done, but it involves `core::any::tags::Type` and `core::any::TaggedOption`, neither of which are exposed in the public API, to construct a `Request` which is then passed to the `Error.provide` method.

Something that I'm curious about is whether or not they are essential to the use of `Request` types (prior to this PR referred to as `Demand`) and if so does the fact that they are kept private imply that `Request`s are only meant to be constructed privately within the standard library? That's what it looks like to me.

These methods ultimately call into code that looks like:
```
/// Request a specific value by tag from the `Error`.
fn request_by_type_tag<'a, I>(err: &'a (impl Error + ?Sized)) -> Option<I::Reified>
where
    I: tags::Type<'a>,
{
    let mut tagged = core::any::TaggedOption::<'a, I>(None);
    err.provide(tagged.as_request());
    tagged.0
}
```

As far as the `Request` API is concerned, one suggestion I would like to make is that the previous example should look more like this:
```
/// Request a specific value by tag from the `Error`.
fn request_by_type_tag<'a, I>(err: &'a (impl Error + ?Sized)) -> Option<I::Reified>
where
    I: tags::Type<'a>,
{
    let tagged_request = core::any::Request<I>::new_tagged();
    err.provide(tagged_request);
    tagged.0
}
```
This makes it possible for anyone to construct a `Request` for use in their own projects without exposing an implementation detail like `TaggedOption` in the API surface.

Otherwise noteworthy is that I had to add `pub(crate)` on both `core::any::TaggedOption` and `core::any::tags` since `Request`s now need to be constructed in the `core::error` module. I considered moving `TaggedOption` into the `core::error` module but again I figured it's an implementation detail of `Request` and belongs closer to that.

At the time I am opening this PR, I have not yet looked into the following bit of feedback:

> We took a look at the generated code and found that LLVM is unable to optimize multiple .provide_* calls into a switch table because each call fetches the type id from Erased::type_id separately each time and the compiler doesn't know that these calls all return the same value. This should be fixed.

This is what I'll focus on next while waiting for feedback on the progress so far. I suspect that learning more about the type IDs will help me understand the need for `TaggedOption` a little better.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-error-handling Proposals relating to error handling. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. PG-error-handling Project Group: Error handling (https://github.com/rust-lang/project-error-handling) T-libs-api Relevant to the library API team, which will review and decide on the RFC. to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet