Concerns about use of `rustdoc` #97

WaffleLapkin · 2022-08-22T11:03:54Z

Hi! I have some concerns about use of rustdoc for non-documentation purposes. In my experience rustdoc regularly has regressions/bugs (a recent example w/ my crate: rust-lang/rust#100204; a long standing bug: rust-lang/rust#83375) (I'm sorry, rustdoc team ;-;) which make it a questionable choice for correctness checks like this crate.

A better (in my opinion anyway) way of getting information about crates would be to directly call compiler APIs. The drawback though is that compiler APIs are unstable, so then this tool would need to be shipped with the compiler like clippy or rustdoc that are linked with a specific version of the compiler (I'm not sure how doable this is). And of course, there is an implementation cost, someone needs to do this.

The text was updated successfully, but these errors were encountered:

epage · 2022-08-22T12:23:48Z

FYI Besides the readme, there is also some discussion about data sources at rust-lang/cargo#374 (comment)

A better (in my opinion anyway) way of getting information about crates would be to directly call compiler APIs. The drawback though is that compiler APIs are unstable, so then this tool would need to be shipped with the compiler like clippy or rustdoc that are linked with a specific version of the compiler

cargo-semvever takes an approach similar to this but never got to the point of being able to be shipped by rustup. Until a tool is shipped in rustup, rustdoc provides the widest range of nightly versions to support, making it easier to adopt even if there are accuracy issues. Something that helps is cargo-semver-check has some decoupling between the data source and the written checks, allowing the data source to be changed without a complete rewrite.

#61 is tracking the work we expect needs to be done before being able to ship this via rustup. One route for possibly including it earlier is getting it marked as a "preview" component (I think it was rustfmt that took this route?). I'm not sure what all bar would be required for that but we'd to demonstrate enough commitment that this would be continued and not abandoned and left to rot.

aDotInTheVoid · 2022-08-24T16:52:28Z

I think you actually need to call into the compiler (or duplicate significant functionality) to be fully accurate

consider changing

pub fn demo(x: ConcreteType)

to

pub fn demo(x: impl Trait)

This is not breaking iff ConcreteType implements Trait. But rustdoc-json doesn't say if a type implements a trait, only what impl items exits for a given type/trait. Converting from a list of impl items, to being able to say if/how a type implements a trait is a hard problem, and to use it you'd need to depend on the compiller (or chalk), or spend alot of effort reimplementing chalk.

Another example of how involved these can get:

pub struct S<T: Trait1>{ ... }
pub trait Trait1: Trait2 { ... }
pub trait Trait2 { ... }
pub trait Trait3 { ... }

impl<T: Trait2> Trait3 for S<T> { ... }
impl Trait1 for i32 { ... }

// v1.1
pub fn demo(x: S<i32>) { ... } 
// v1.1
pub fn demo(x: impl Trait3) { ... }

obi1kenobi · 2022-08-24T17:21:24Z

I agree that there are cases where rustdoc information is insufficient, and I've said as much myself specifically in regards to this project. Our approach doesn't currently require "fully accurate," it requires "no false positives." There are easily dozens if not hundreds of semver rules, and we currently can't scan for them all anyway. This impl trait example is just another one we can't currently scan for.

The underlying query engine, Trustfall, is designed to make it easy to plug in additional data sources down the road, whether those are rustc, rust-analyzer, chalk, or something else which would allow us to cover this and other cases in the future. In the meantime, there are plenty of semver breaks that are possible to catch with rustdoc JSON alone. We could go to the compiler (or other tools) for those cases, but other semver linters have found drawbacks in such approaches which to my eyes seem more severe than what we're facing here.

I also wouldn't be at all surprised if it turns out that most semver-breaks that happen in practice are the simpler kinds that aren't caused by generics / impl trait / complex uses of lifetimes etc. In my mind, it would just be an analogue of "most injuries on a ski mountain happen on the easiest trails."

P.S.: I recently gave a 10min talk on Trustfall at a conference, which should help substantiate my claim that it wouldn't be difficult to integrate compiler-provided info into this tool's semver queries: https://predr.ag/talks/#how-to-query-almost-everything

llogiq · 2022-08-30T16:48:13Z

A different solution (albeit possibly equaly volatile) would be to use rust-analyzer's backend. It works on stable Rust, is packaged in crates and has type & trait information. If needed, you could pin a version and work on top of that.

epage · 2022-08-30T16:52:55Z

I asked matklad about rust-analyzer and he felt it wasn't a good fit. In general, I recommend people also check out that thread as we talk about different data sources.

obi1kenobi · 2023-05-29T22:13:36Z

With almost a year's worth of hindsight, I think rustdoc has served us well. We've been able to work closely with the rustdoc team in making both of our tools better, and we have every reason to believe the fit will be even better over time.

Thanks for the discussion, everyone!

obi1kenobi closed this as completed May 29, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Concerns about use of `rustdoc` #97

Concerns about use of `rustdoc` #97

WaffleLapkin commented Aug 22, 2022

epage commented Aug 22, 2022

aDotInTheVoid commented Aug 24, 2022 •

edited

obi1kenobi commented Aug 24, 2022 •

edited

llogiq commented Aug 30, 2022

epage commented Aug 30, 2022

obi1kenobi commented May 29, 2023

Concerns about use of rustdoc #97

Concerns about use of rustdoc #97

Comments

WaffleLapkin commented Aug 22, 2022

epage commented Aug 22, 2022

aDotInTheVoid commented Aug 24, 2022 • edited

obi1kenobi commented Aug 24, 2022 • edited

llogiq commented Aug 30, 2022

epage commented Aug 30, 2022

obi1kenobi commented May 29, 2023

Concerns about use of `rustdoc` #97

Concerns about use of `rustdoc` #97

aDotInTheVoid commented Aug 24, 2022 •

edited

obi1kenobi commented Aug 24, 2022 •

edited