Lint to detect semantic versioning failures #374

nikomatsakis · 2014-08-14T18:56:16Z

It would be nice if cargo could give you warnings when you try to publish a new version of a package that obviously breaks the API of a previous version.

An important but easily overlooked part of this would be whether a type is sendable etc.
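
To illustrate why this is easy to overlook: whether a type is `Send` is inferred from its fields, so changing a private field can alter the public API without touching any signature. A minimal sketch, assuming two hypothetical structs `ConfigV1` and `ConfigV2` that stand in for the same type in two releases:

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical type as shipped in an earlier release. `Arc<String>` is
/// `Send`, so the compiler infers `ConfigV1: Send`.
#[allow(dead_code)]
pub struct ConfigV1 {
    name: Arc<String>,
}

/// The same type after a private field was switched to `Rc`. No public
/// signature changed, yet `ConfigV2` is no longer `Send`.
#[allow(dead_code)]
pub struct ConfigV2 {
    name: Rc<String>,
}

/// Compiles only when `T: Send`. Downstream code that moves a value into
/// another thread relies on the same bound.
pub fn assert_send<T: Send>() {}

pub fn check() {
    assert_send::<ConfigV1>();
    // The next line would fail to compile, because `Rc<String>` is not `Send`:
    // assert_send::<ConfigV2>();
}
```

A downstream crate that sends the value to another thread would stop compiling after such an upgrade, even though the release looks like an internal change.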

thehydroimpulse · 2014-08-14T19:04:23Z

Perhaps when a registry is online it could outright refuse (perhaps with an override) to push a new change when a breaking change happened and the semantic versioning has not been appropriately changed (i.e., they bumped the patch, but the change needs to bump the major).
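
A rough sketch of what such a gate could look like. Every name here (`Bump`, `Version`, `check_publish`, the `allow_override` flag) is hypothetical and not an existing cargo or registry API. It only handles plain `major.minor.patch` versions with `major >= 1`, ignoring pre-release tags and the pre-1.0 convention:

```rust
/// Size of a version bump, ordered from least to most significant.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Bump {
    None,
    Patch,
    Minor,
    Major,
}

/// A plain `major.minor.patch` triple (pre-release and build metadata ignored).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Classify the bump between two versions by the most significant field that differs.
pub fn actual_bump(old: Version, new: Version) -> Bump {
    if new.major != old.major {
        Bump::Major
    } else if new.minor != old.minor {
        Bump::Minor
    } else if new.patch != old.patch {
        Bump::Patch
    } else {
        Bump::None
    }
}

/// Hypothetical registry-side gate: accept the publish only if the supplied
/// bump is at least as large as the one the detected changes require, or if
/// the author explicitly overrides the check.
pub fn check_publish(
    old: Version,
    new: Version,
    required: Bump,
    allow_override: bool,
) -> Result<(), String> {
    let actual = actual_bump(old, new);
    if actual >= required || allow_override {
        Ok(())
    } else {
        Err(format!(
            "detected changes require a {:?} bump, but a {:?} bump was supplied",
            required, actual
        ))
    }
}
```

The hard part, computing `required` from the two versions of the crate, is deliberately left out of this sketch.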

steveklabnik · 2014-08-14T20:37:40Z

Yeah, there has been some small amount of discussion about attempting to determine breakage as part of the metadata the central server has.

wycats · 2014-08-14T21:57:02Z

@thehydroimpulse yes, that's the idea 😄

emberian · 2015-07-09T08:40:59Z

The tool should be separate from cargo, with a subcommand that shells out to it.

conradkleinespel · 2016-01-15T23:40:32Z

I'm very much interested in this issue. I would like to help.

Given that this issue is marked as E-hard, how can I know if I would be able to contribute in a meaningful way? Would you happen to have any pointers as to what I should look into? I'm thinking clippy might be a good way to learn about lints.

Even if I turn out not to be skilled enough to actually code the thing, I'd be happy to help with testing, documenting or anything else related to this issue.

Side note @cmr: why should this be separate from cargo?

alexcrichton · 2016-01-16T01:11:36Z

Yeah this is indeed unfortunately E-hard because it will involve some significant and independent design and implementation work. There's not much of a skeleton to go off today other than the idea that this would in theory boil down to an "AST diff" tool.
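
As a very rough illustration of the "diff" idea, here is a sketch that compares two snapshots of a public API. It assumes the API has already been extracted into a map from item path to rendered signature, which is the hard part and is not shown. `ApiSnapshot`, `ApiDiff` and `diff_api` are hypothetical names:

```rust
use std::collections::BTreeMap;

/// Hypothetical snapshot of a crate's public API: item path -> rendered signature.
pub type ApiSnapshot = BTreeMap<String, String>;

/// Item paths grouped by what happened to them between two snapshots.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ApiDiff {
    pub added: Vec<String>,
    pub removed: Vec<String>,
    pub changed: Vec<String>,
}

/// Compare two snapshots by item path and by textual signature.
pub fn diff_api(old: &ApiSnapshot, new: &ApiSnapshot) -> ApiDiff {
    let mut diff = ApiDiff::default();
    for (path, old_sig) in old {
        match new.get(path) {
            None => diff.removed.push(path.clone()),
            Some(new_sig) if new_sig != old_sig => diff.changed.push(path.clone()),
            Some(_) => {}
        }
    }
    for path in new.keys() {
        if !old.contains_key(path) {
            diff.added.push(path.clone());
        }
    }
    diff
}
```

Comparing signatures as text is a simplification: it flags harmless renames of parameters and misses semantic changes such as auto-trait differences, which is part of why a real tool needs compiler-level information.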

jonas-schievink · 2016-01-16T11:03:58Z

I've been thinking that we could use the incremental compilation infrastructure (once it's set up): Save incr. comp. data for the old version, and check what has changed in the new version. Of course, this would still need a set of "breakage rules" that dictate which changes are OK for which kind of version bump, but at least finding changes is then done by the compiler.
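
A set of "breakage rules" could, in the simplest case, be a mapping from a kind of change to the smallest version bump that permits it. The sketch below assumes a hypothetical, heavily simplified `Change` enum; a real rule set would be far more fine-grained:

```rust
/// Smallest version bump that permits a change, ordered by significance.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum RequiredBump {
    Patch,
    Minor,
    Major,
}

/// Hypothetical, coarse classification of changes found between two versions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Change {
    ItemAdded,
    ItemRemoved,
    SignatureChanged,
    BodyChanged,
}

/// One rule per kind of change.
pub fn rule(change: Change) -> RequiredBump {
    match change {
        // Removing or altering public items can break downstream code.
        Change::ItemRemoved | Change::SignatureChanged => RequiredBump::Major,
        // New public items are additive.
        Change::ItemAdded => RequiredBump::Minor,
        // Implementation-only changes leave the API untouched.
        Change::BodyChanged => RequiredBump::Patch,
    }
}

/// The bump required by a set of changes is the largest bump any single
/// change requires. Returns `None` when nothing changed.
pub fn required_bump(changes: &[Change]) -> Option<RequiredBump> {
    changes.iter().copied().map(rule).max()
}
```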

conradkleinespel · 2016-01-18T15:37:10Z

@alexcrichton Alright, thanks.
@jonas-schievink Oh, that sounds like a good idea. I'm off to read the relevant RFC then.

dwijnand · 2018-09-02T20:35:23Z

Looks (unless I'm confused) like GSoC 2017 led to the creation of https://github.com/rust-lang-nursery/rust-semverver, which seeks to address this. Looks like that needs more work, if anyone here is interested.

epage · 2022-07-25T12:26:52Z

cargo-semverver uses the metadata in an rlib plus some hard coded rules to determine what a breaking change is.

cargo-breaking originally was parsing the crate using cargo expand + syn but got bogged down under the weight of re-implementing a lot of semantics. It was in the middle of switching to internal rustc APIs for the parsing.

cargo-crate-api is a PoC for using the json output from rustdoc to detect breaking changes. cargo-public-api is somewhat similar.

One approach that I've not seen used is to build on top of rust-analyzer.

To me, the ideal solution would:

- Detect breaking changes in features exposed and not just the Rust API
- Detect breaking changes in one run and take into account conditional APIs, including platform and features
- Expose different breaking change rules as lints so people can enable/disable them as they wish, considering there is some wiggle room for breaking changes
- Allow comparing the working dir to a git rev, path, or registry version
- Support multiple report outputs
  - Adds, changes, and removes for including in a changelog
  - Lints of what broke
  - Potential version (since behavior changes might exist, we need to be clear to not give people a false sense of security)
- Work on stable
  - If not, at least have a wide range of nightly versions it is compatible with
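
To make the "rules as lints" item concrete, here is a minimal sketch of per-rule levels, loosely modeled on how compiler lints can be allowed, warned on, or denied. The types and the rule names used with them are hypothetical and not taken from any existing tool:

```rust
use std::collections::HashMap;

/// How a finding from a given rule should be treated.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Allow,
    Warn,
    Deny,
}

/// A single finding; `rule` is the (hypothetical) name of the rule that fired.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Finding {
    pub rule: String,
    pub message: String,
}

/// Per-rule levels with a fallback for rules that were not configured.
pub struct LintConfig {
    default: Level,
    overrides: HashMap<String, Level>,
}

impl LintConfig {
    pub fn new(default: Level) -> Self {
        LintConfig { default, overrides: HashMap::new() }
    }

    pub fn set(&mut self, rule: &str, level: Level) {
        self.overrides.insert(rule.to_string(), level);
    }

    pub fn level(&self, rule: &str) -> Level {
        self.overrides.get(rule).copied().unwrap_or(self.default)
    }
}

/// Split findings into (warnings, errors), dropping the allowed ones.
pub fn triage<'a>(
    config: &LintConfig,
    findings: &'a [Finding],
) -> (Vec<&'a Finding>, Vec<&'a Finding>) {
    let mut warnings = Vec::new();
    let mut errors = Vec::new();
    for finding in findings {
        match config.level(&finding.rule) {
            Level::Allow => {}
            Level::Warn => warnings.push(finding),
            Level::Deny => errors.push(finding),
        }
    }
    (warnings, errors)
}
```

With this shape, a project that treats some change as acceptable can set that one rule to `Allow` without disabling the rest.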

Breaking down potential data sources:

- rust-analyzer: Not having worked with it, I can't really say how well it can meet these goals.
  - Either using its internals or by going through LSP
- rlib:
  - I have the impression that the internal rustc APIs for introspecting on rlib's is fairly unstable, requiring a narrow range of nightlies
  - Requires recompiling for every set of features and platforms (which requires every feature combination to compile)
- rustdoc json:
  - I have the impression that there isn't a lot of churn, allowing a wide range of versions to be compatible. We can also use a parser that supports a wider range
  - I think this will give us conditional compilation annotations so we can compare them in a single run

badboy · 2022-07-25T12:34:33Z

There's also cargo-semver-checks now (cc @obi1kenobi), also based on rustdoc output.

epage · 2022-07-25T12:40:01Z

To help cover the importance of, and use case for, this tool (at least partly obvious, but I still want to cover it):

The more obvious is to prevent unintentional breaking changes, like the one that recently hit clap due to an auto-trait.

The less obvious is the impact on maintainers and their ability to scale up and remove toil in their development. cargo is fine making breaking changes but tracking when to bump the breaking version field is still work. We recently had a breaking change released without bumping the right field (#10803). We don't have a lot of process in place (e.g. conventionalcommits.org/) to help catch these. I feel the overhead of tracking breaking changes is one of several limiting factors for the cargo repo to contain more crates, and resolving this would be a big help for the cargo team. Providing a trivial way of checking the status of all crates and helping to write the release notes would make it easier to scale up what maintainers are able to deal with, much like the difference I saw when I switched from manually publishing crates to cargo-release.

EDIT: It'd also shift documentation to automation, see #8736

epage · 2022-07-25T12:51:31Z

@JohnTitor I'd be interested in hearing the rust-semverver perspective on my earlier comment, including what your thoughts are on:

- Alternative data sources like rustdoc
- Expanding the scope to include checking the manifest for breaking changes (I don't think this is handled currently)
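
One manifest-level check is small enough to sketch: removing a Cargo feature is a breaking change, because dependents that enable it no longer build. The function below assumes the feature names of both releases have already been read out of each `Cargo.toml` (parsing is not shown), and `removed_features` is a hypothetical helper name:

```rust
use std::collections::BTreeSet;

/// Features declared by the old release that are missing from the new one.
/// The result is sorted because `BTreeSet` iterates in order.
pub fn removed_features<'a>(
    old: &BTreeSet<&'a str>,
    new: &BTreeSet<&'a str>,
) -> Vec<&'a str> {
    old.difference(new).copied().collect()
}
```

A non-empty result would point at a major bump. Added features are additive and do not show up here.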

ehuss · 2022-07-25T13:45:02Z

Regarding rust-semverver:

> I have the impression that the internal rustc APIs for introspecting on rlib's is fairly unstable, requiring a narrow range of nightlies

There was an attempt to bring it into the rust-lang/rust repo so that it would always be in a buildable state (like clippy or rustfmt), and thus usable on stable. I think that is still an option, it just needs active people to maintain it. I think this is still an attractive option, since having the full power of the compiler to run queries (like trait and method resolution) can be useful. I do have concerns about the complexity of rust-semverver and how it is implemented. The issues on the issue tracker look somewhat concerning.

OTOH, using rustdoc JSON could be much lighter weight and easier to handle. However, it looks to have some limitations. For example, I don't see a way to inspect macro definitions with it (which would be needed to see how they change). Also, it is not clear what trajectory it has for stabilization, which would be required here. It is not clear how far away that is.

epage · 2022-07-25T14:24:46Z

> I think this is still an attractive option, since having the full power of the compiler to run queries (like trait and method resolution) can be useful.

Agreed. We would need a way though for users to skip checking some parts of the API (if it doesn't already have it) since some items will be exposed only for the sake of macros and, depending on how things are setup, the author can break those as needed.

This is also why I'm curious about using rust-analyzer for a base. It is working towards having the full power of the compiler but I'm assuming it doesn't prefer one set of conditional compilation settings but tracks all states the code can run in. If that is the case, then that would be the most ideal for such a tool.

> OTOH, using rustdoc JSON
> ...
> Also, it is not clear what trajectory it has for stabilization, which would be required here. It is not clear how far away that is.

Yes, for both approaches, becoming a rustup component (or included with one) would resolve the nightly issue.

obi1kenobi · 2022-07-25T14:58:41Z

> OTOH, using rustdoc JSON could be much lighter weight and easier to handle. However, it looks to have some limitations. For example, I don't see a way to inspect macro definitions with it (which would be needed to see how they change). Also, it is not clear what trajectory it has for stabilization, which would be required here. It is not clear how far away that is.

It's true that rustdoc JSON is somewhat limited — another thing it can't be used to check (AFAICT) is re-exports, since the JSON format doesn't seem to mention them at all: obi1kenobi/cargo-semver-checks#2

On the flip side, it seems that docs.rs is considering hosting the rustdoc JSON alongside the HTML docs when the JSON output is stabilized (rust-lang/docs.rs#1285) which would make checking even faster and more convenient in the common cases. For cargo-semver-checks, the process of generating the JSON files is the long pole in the runtime — the checking itself takes milliseconds for most crates that are not egregiously large.

In my view, we don't want an "either-or" but an "and." We won't find a single solution that dominates all others. For example:

- It's going to be hard to beat rust-semverver for thoroughness and ability to check for the most complex semver breaks, because it has the full power of the compiler.
- It's going to be hard to beat cargo-semver-checks for speed and extensibility because it's based on a query engine specifically designed for speed and extensibility. Implementing a new rule is just writing a query in a strongly-typed DSL that's syntactically GraphQL but with more expressive semantics. But it probably won't be able to check the most complex semver rules involving generics (e.g. GATs), specifically because it doesn't do all the work that rust-semverver does.
- It's going to be hard to beat cargo-public-api for generating a diff suitable for inclusion in a changelog, since that's specifically what it's designed for. Etc.

With the above in mind:

- I'd argue for a "best tool for the job" approach.
- I'd like to work out a common set of names for the different semver rules + design an attribute to disable checking a particular rule that all tools would understand. This will make using different tools for different jobs practical and ergonomic.
- I think it's substantially less than a day's worth of work to make cargo-semver-checks aware of manifests and their semver-breaking cases. I think cargo-semver-checks is the tool for that, and I'd be happy to build it around / after RustConf.

I'm also excited about the idea of making rust-analyzer-provided information queryable through cargo-semver-checks i.e. through a Trustfall query API. If anyone with knowledge of rust-analyzer wants to pair-program a prototype, you know where to find me and you just need to name a day and time compatible with US/Eastern time.

epage · 2022-07-26T14:17:10Z

@matklad what are your thoughts on using rust-analyzer (internal or LSP API) as the data source for a cargo-semverver-like tool?

In particular:

- Would it generally be sufficient?
- Could it handle cases that rustdoc json output can't, like macro definitions?
- Could we superimpose all configurations on top of each other rather than having to re-run per configuration like cargo-semverver requires since it relies on rlib?

matklad · 2022-07-26T14:38:52Z

Using rust-analyzer for semverver would give you a nice prototype, but won't scale to production readiness, for two reasons:

- rust-analyzer isn't precise enough for that (it has false errors and such)
- rust-analyzer doesn't provide any kind of stable API

> Could we superimpose all configurations on top of each other rather having to re-run per configuration like cargo-semverver requires since it relies on rlib

So, there are two kinds of things you can do:

- you can build a heuristic, best-effort tool with some amount of false positives and false negatives, and then, sure, you can try to analyze all the cfgs together
- you can build an exact tool that follows the language semantics. In the latter case, you have to have a loop over all combinations of cfg flags: that's just how the language works, everything semantically relevant comes into existence only after expansion.
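
To give a feel for what "a loop over all combinations" means in the exact case, here is a small sketch that enumerates every subset of a list of Cargo features. It covers features only; other cfg inputs such as target platform would multiply the count further. `feature_combinations` is a hypothetical helper, not part of any existing tool:

```rust
/// Enumerate every subset of `features` (the power set), so that an exact
/// tool could run its analysis once per combination.
pub fn feature_combinations<'a>(features: &[&'a str]) -> Vec<Vec<&'a str>> {
    let n = features.len();
    // Each subset is encoded as a bitmask, so `n` must fit in a `usize`.
    assert!(n < usize::BITS as usize, "too many features to enumerate");
    (0..1usize << n)
        .map(|mask| {
            features
                .iter()
                .enumerate()
                // Keep feature `i` when bit `i` of the mask is set.
                .filter(|&(i, _)| (mask >> i) & 1 == 1)
                .map(|(_, feature)| *feature)
                .collect()
        })
        .collect()
}
```

The result has 2^n entries, so ten features already mean 1024 analysis runs. That growth is the cost of the exact approach compared with the heuristic one.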

This was referenced Sep 24, 2014

crates.io: use rustdoc to work on ABI stuff #47

Closed

Compare API changes using the extracted stability index. rust-lang/rust#9060

Closed

Create an apitest tool for stabilizing core interfaces. rust-lang/rust#6508

Closed

alexcrichton added the E-hard Experience: Hard label Oct 20, 2014

blaenk mentioned this issue Jul 9, 2015

recommend semver-abiding version number #1793

Closed

ehuss mentioned this issue Jul 14, 2020

Concept: cargo publish automatically running this tool. rust-lang/rust-semverver#130

Open

epage mentioned this issue Apr 21, 2022

Additional SemVer compatibility items. #8736

Open

40 tasks

epage mentioned this issue Aug 22, 2022

Concerns about use of rustdoc obi1kenobi/cargo-semver-checks#97

Closed

epage added A-new-subcommand Area: new subcommand S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. and removed E-hard Experience: Hard labels May 3, 2023

epage mentioned this issue Jun 27, 2023

Check Cargo.lock in version control for libraries #8728

Closed

SchmErik mentioned this issue Mar 5, 2024

Investigate tools for checking semver violations risc0/risc0#1517

Closed
