Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concerns about use of rustdoc #97

Closed
WaffleLapkin opened this issue Aug 22, 2022 · 6 comments
Closed

Concerns about use of rustdoc #97

WaffleLapkin opened this issue Aug 22, 2022 · 6 comments

Comments

@WaffleLapkin
Copy link
Sponsor

Hi! I have some concerns about use of rustdoc for non-documentation purposes. In my experience rustdoc regularly has regressions/bugs (a recent example w/ my crate: rust-lang/rust#100204; a long standing bug: rust-lang/rust#83375) (I'm sorry, rustdoc team ;-;) which make it a questionable choice for correctness checks like this crate.

A better (in my opinion anyway) way of getting information about crates would be to directly call compiler APIs. The drawback though is that compiler APIs are unstable, so then this tool would need to be shipped with the compiler like clippy or rustdoc that are linked with a specific version of the compiler (I'm not sure how doable this is). And of course, there is an implementation cost, someone needs to do this.

@epage
Copy link
Collaborator

epage commented Aug 22, 2022

FYI Besides the readme, there is also some discussion about data sources at rust-lang/cargo#374 (comment)

A better (in my opinion anyway) way of getting information about crates would be to directly call compiler APIs. The drawback though is that compiler APIs are unstable, so then this tool would need to be shipped with the compiler like clippy or rustdoc that are linked with a specific version of the compiler

cargo-semvever takes an approach similar to this but never got to the point of being able to be shipped by rustup. Until a tool is shipped in rustup, rustdoc provides the widest range of nightly versions to support, making it easier to adopt even if there are accuracy issues. Something that helps is cargo-semver-check has some decoupling between the data source and the written checks, allowing the data source to be changed without a complete rewrite.

#61 is tracking the work we expect needs to be done before being able to ship this via rustup. One route for possibly including it earlier is getting it marked as a "preview" component (I think it was rustfmt that took this route?). I'm not sure what all bar would be required for that but we'd to demonstrate enough commitment that this would be continued and not abandoned and left to rot.

@aDotInTheVoid
Copy link

aDotInTheVoid commented Aug 24, 2022

I think you actually need to call into the compiler (or duplicate significant functionality) to be fully accurate

consider changing

pub fn demo(x: ConcreteType)

to

pub fn demo(x: impl Trait)

This is not breaking iff ConcreteType implements Trait. But rustdoc-json doesn't say if a type implements a trait, only what impl items exits for a given type/trait. Converting from a list of impl items, to being able to say if/how a type implements a trait is a hard problem, and to use it you'd need to depend on the compiller (or chalk), or spend alot of effort reimplementing chalk.

Another example of how involved these can get:

pub struct S<T: Trait1>{ ... }
pub trait Trait1: Trait2 { ... }
pub trait Trait2 { ... }
pub trait Trait3 { ... }

impl<T: Trait2> Trait3 for S<T> { ... }
impl Trait1 for i32 { ... }

// v1.1
pub fn demo(x: S<i32>) { ... } 
// v1.1
pub fn demo(x: impl Trait3) { ... }

@obi1kenobi
Copy link
Owner

obi1kenobi commented Aug 24, 2022

I agree that there are cases where rustdoc information is insufficient, and I've said as much myself specifically in regards to this project. Our approach doesn't currently require "fully accurate," it requires "no false positives." There are easily dozens if not hundreds of semver rules, and we currently can't scan for them all anyway. This impl trait example is just another one we can't currently scan for.

The underlying query engine, Trustfall, is designed to make it easy to plug in additional data sources down the road, whether those are rustc, rust-analyzer, chalk, or something else which would allow us to cover this and other cases in the future. In the meantime, there are plenty of semver breaks that are possible to catch with rustdoc JSON alone. We could go to the compiler (or other tools) for those cases, but other semver linters have found drawbacks in such approaches which to my eyes seem more severe than what we're facing here.

I also wouldn't be at all surprised if it turns out that most semver-breaks that happen in practice are the simpler kinds that aren't caused by generics / impl trait / complex uses of lifetimes etc. In my mind, it would just be an analogue of "most injuries on a ski mountain happen on the easiest trails."

P.S.: I recently gave a 10min talk on Trustfall at a conference, which should help substantiate my claim that it wouldn't be difficult to integrate compiler-provided info into this tool's semver queries: https://predr.ag/talks/#how-to-query-almost-everything

@llogiq
Copy link

llogiq commented Aug 30, 2022

A different solution (albeit possibly equaly volatile) would be to use rust-analyzer's backend. It works on stable Rust, is packaged in crates and has type & trait information. If needed, you could pin a version and work on top of that.

@epage
Copy link
Collaborator

epage commented Aug 30, 2022

I asked matklad about rust-analyzer and he felt it wasn't a good fit. In general, I recommend people also check out that thread as we talk about different data sources.

@obi1kenobi
Copy link
Owner

With almost a year's worth of hindsight, I think rustdoc has served us well. We've been able to work closely with the rustdoc team in making both of our tools better, and we have every reason to believe the fit will be even better over time.

Thanks for the discussion, everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants