Semantic-release does not re-run on runs that failed post-prepare due to tag being there #3178

hanseltime · 2024-02-04T19:34:08Z

Current behavior

Note: I am opening this as a bug report after realizing that I went the feature-request route too quickly; a bug report makes it easier for people to see and discuss the solve. Linked feature for context

If you use a plugin in your semantic-release file that fails due to an environment issue (auth issue or network error, etc.) in the publish phase, semantic-release will correctly fail that run with a log. However, if you remedy the environment issue and retry the same job on the same SHA, the run will appear to have passed even though it did not run any of the release steps since the tag is already there.

This leads to phantom assumptions about the release by CI/CD engineers because they see the green checkmark and assume that a release was cut and their work is done. It can also lead to a false sense of having fixed the environmental issue because we never get back to the release step that failed in the first place.

Expected behavior

If a release was not completed, re-running semantic-release should retry all steps again so that the one calling the job can be certain that they are retesting the same failure states.

In the event that idempotent publish methods are stacked (e.g. npm package publish and GitHub publish, where we failed at GitHub publish), the expected behavior is that a failure would occur in the npm publish phase with a descriptive error of "X version already exists". As the CI engineer, I would then know that I need to tag and re-release my semi-broken branch state per the FAQ, because the config in my repo makes it clear that I failed at a point that I declaratively control.

`semantic-release` version

21.1.2

CI environment

GitHub Actions

Plugins used

'@semantic-release/commit-analyzer'
'@semantic-release/release-notes-generator'
'@semantic-release/changelog'
'@semantic-release/npm'
'@semantic-release/github'

`semantic-release` configuration

module.exports = {
  branches: [
    'main',
    { name: 'develop', prerelease: true },
  ],
  plugins: [
    [
      '@semantic-release/commit-analyzer',
      {
        preset: 'angular',
        releaseRules: [
          { type: 'refactor', release: 'patch' },
          { type: 'perf', release: 'patch' },
        ],
      },
    ],
    '@semantic-release/release-notes-generator',
    '@semantic-release/changelog',
    '@semantic-release/npm',
    '@semantic-release/github',
  ],
};

CI logs

This is a private repo release, so I cannot link the logs.

The logs are the same as you would normally see when re-running a release on a branch that has already been released.


hanseltime · 2024-02-04T20:03:53Z

As discussed in the above feature request, it seems that semantic-release does not want to introduce further complexity to the configuration of the project, and that the expansive plugin ecosystem makes it very difficult to understand which plugins rely on the git tags instead of the API payloads passed to the plugin.

Because of this, I can start by suggesting a few solves for the maintainers to react to, rather than adding to their cognitive load:

Solve 1: It's just a doc thing. We need to write up a "release does not mean publish" set of documentation and a standard operating procedure (maybe in the FAQ) that explains never to re-run semantic-release if it failed after prepare, or how to delete the tag on the repo and then re-run the release.

Solve 2 (listed here for completeness but off the table since it was closed): Provide an opt-in flag in the configuration that allows semantic-release to move tagging to post-publish, with adequate documentation detailing that this may not play well with legacy plugins.

Solve 3 (feels dangerous): Writing a failure wrapper that removes the tag on a failure. (Note: this would need to use the failure command, but would be brittle until a fix for not all errors triggering the failure command is implemented.)

Longer term solves targeting the plugin ecosystem as a whole, with this assumption:

I'm assuming the plugins that clash with this paradigm are the ones that reach out to a repo system (like GitHub) and call API methods that do something to the effect of "release tag X that is on the branch", which does create a bit of a circular dependency.

Longer term solve 1: semantic-release applies and consumes a "finished" tag or set of notes (probably notes I'm guessing). In this one, semantic-release pushes a note to the SHA that it is releasing after it has a successful publish phase that indicates "publish finished". Semantic-release would then perform an additional "do I re-run" lookup, where it would check to see if the tag AND release note are both present in a completed state. If the tag exists and the note does not, then semantic-release would re-release with the "nextRelease" actually being the tag of the current SHA.
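The lookup described above can be sketched as a small decision function. This is only an illustration of the proposal, not existing semantic-release behavior: `decideRun`, its input fields, and the returned action labels are all hypothetical names.

```javascript
// Hypothetical sketch of the proposed "do I re-run" lookup.
// None of these names exist in semantic-release.
function decideRun({ tagOnHead, publishNoteOnHead }) {
  if (tagOnHead && publishNoteOnHead) {
    // Tag and "publish finished" note are both present: the release completed.
    return { action: 'skip' };
  }
  if (tagOnHead) {
    // Tag exists but the publish phase never finished:
    // re-release, using the tag on the current SHA as the next release.
    return { action: 'retry', nextRelease: tagOnHead };
  }
  // No tag on this SHA: continue with the normal release flow.
  return { action: 'release' };
}

console.log(decideRun({ tagOnHead: 'v1.2.3', publishNoteOnHead: false }));
```

The only new state is the note; a SHA with a tag but no note is treated as an unfinished release rather than a finished one.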

For this solve, we could even flesh out a better note syntax so that we could log which plugins published successfully. That would also allow us to skip idempotent publishes on re-run once they have succeeded. That would potentially be a v2 to think through later, with the simple "publish succeeded" note, plus some documentation around adding a new release commit when an idempotent publish fails, as the building block for it.
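A richer note could, for example, carry the list of plugins whose publish step already succeeded. The sketch below assumes a made-up JSON shape with a `published` array; neither the shape nor `pluginsToRun` is part of semantic-release.

```javascript
// Hypothetical note payload: records which publish plugins already succeeded.
// Returns the configured plugins that still need to publish on a re-run.
function pluginsToRun(configuredPlugins, note) {
  const done = new Set(note && Array.isArray(note.published) ? note.published : []);
  return configuredPlugins.filter((name) => !done.has(name));
}

const note = { version: '1.2.3', published: ['@semantic-release/npm'] };
const remaining = pluginsToRun(
  ['@semantic-release/npm', '@semantic-release/github'],
  note
);
console.log(remaining);
```

With no note at all, every configured plugin runs again, which matches the simpler "publish succeeded" proposal.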

(I was going to suggest a long term solve around upgrading plugins being allowed to drive tags and interleave when the tag gets sent up, but as I wrote it out, it became a massive mess of failure points and complex edge cases that robbed semantic-release of its flow control. So I really only have the one long term solve for a reaction. Which I personally think might be a good solve).

I am amenable to implementing any solutions that might be agreed upon in here, so please use this as an opportunity to agree on the solve without the concern of creating an unreasonable expectation on implementation.

travi · 2024-02-05T03:57:13Z

> Solve 1: It's just a doc thing. We need to write up a "release does not mean publish" set of documentation and a standard operator procedure (maybe in the FAQ) for how to never re-run your semantic-release if it failed after prepare or maybe to talk about how to delete the tag on repo and then run re-release

this is where i'd recommend starting. fully admit that our docs could use some help and this is a common stumbling point for folks that haven't run across the situation before. while it may not be called out as clearly as it could be in the docs, our message should be mostly consistent throughout our support channels over time: the best course of action is to delete the tag after fixing the problem that resulted in the partial release, and then re-run the release pipeline. making this more clear in the docs would certainly be helpful for folks.
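A minimal sketch of that recovery step, written as a helper that builds the git argument lists. It assumes the default semantic-release tag format `v${version}` and a remote named `origin`; `tagCleanupCommands` is a hypothetical helper name.

```javascript
// Builds the git argument lists for removing a release tag locally and on the remote.
// Assumptions: default tag format `v${version}` and a remote called `origin`.
function tagCleanupCommands(version, remote = 'origin') {
  const tag = `v${version}`;
  return [
    ['tag', '--delete', tag],                        // delete the local tag
    ['push', remote, '--delete', `refs/tags/${tag}`], // delete the tag on the remote
  ];
}

for (const args of tagCleanupCommands('1.2.3')) {
  console.log(['git', ...args].join(' '));
}
```

Each array can be passed to `child_process.execFileSync('git', args)`. Only do this after fixing the underlying problem and checking which publish steps already succeeded, then re-run the release pipeline.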

> Solve 3 (feels dangerous): Writing a failure wrapper that removes the tag on a failure.

i agree that this is dangerous and can say that this is a big part of why semantic-release does not yet have this feature built in. removing the tag is less of a concern than undoing other release steps. removing the tag is a step toward making the overall release process atomic, which is very difficult because of how many independent systems a release could interact with. at minimum, every plugin would need a way to roll back its own steps. complicating matters further, some systems prevent rolling back from even being possible. as previously mentioned, an example of this is the npm registry. when a version is unpublished, it can never be published again. removing the tag based on a failure after successfully publishing would leave the next release attempt in a state that would be far more difficult to overcome than removing a tag and trying again.

it is the current opinion of the maintainers of this project that it is unrealistic for semantic-release to be able to handle releases in a fully atomic way, especially without significant additional complexity beyond the current state of the project. it is beyond the scope of semantic-release to automate your release to the point where you no longer need to understand the steps being taken in your unique process. if there is a failure during the process, we believe a human is needed to understand the point at which the release failed, to decide what steps should be rolled back and how, or whether enough of the release completed successfully to leave the completed steps in place and resolve the problem for the next release. it is worth noting again that partial releases are why we consider it unsafe to remove a tag automatically after a failure, but also why we consider it unsafe to delay addition of the tag. if a failure happens after a step that cannot be undone and the tag has not yet been added, future release attempts will try to release the same version again and will fail over and over.
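The last point can be illustrated with a toy model. This is not semantic-release code: it assumes the next version is derived only from existing tags, that every attempt has releasable commits, and that the registry rejects a version it has already seen. All names are made up for the illustration.

```javascript
// Toy model of one release attempt. `tagFirst` selects when the tag is added.
function attemptRelease(state, tagFirst) {
  const version = `1.0.${state.tags.length}`; // next version depends only on tags
  if (tagFirst) state.tags.push(version);
  if (state.registry.has(version)) return `${version}: failed, already published`;
  state.registry.add(version); // irreversible publish step succeeds
  if (state.laterStepBroken) return `${version}: failed after publish`;
  if (!tagFirst) state.tags.push(version);
  return `${version}: released`;
}

// First attempt hits an environment problem in a later step,
// which is fixed before the following attempts.
function simulate(tagFirst) {
  const state = { tags: [], registry: new Set(), laterStepBroken: true };
  const log = [attemptRelease(state, tagFirst)];
  state.laterStepBroken = false;
  log.push(attemptRelease(state, tagFirst));
  log.push(attemptRelease(state, tagFirst));
  return log;
}

console.log('tag before publish:', simulate(true));
console.log('tag after publish: ', simulate(false));
```

In this model, tagging first lets the next attempt move on to a new version, while tagging last keeps recomputing the version that the registry already holds.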

hanseltime · 2024-02-05T04:20:59Z

Thanks for the info @travi. I will take a crack at adding some documentation in a place that would've assuaged this if you're okay with that.

It sounds like we both agree that Solve 2 and Solve 3 are not feasible for semantic-release.

I'm curious what your opinion is of the "Longer term solve 1". While I don't expect semantic-release to manage all of my release, the way it is presented is that it is a declarative pipeline file that triggers a success phase when release is done. If semantic-release does not get to that success command, that would imply that it did not reach success and should retry again if re-run.

With this current bug, semantic-release's contract is more like "a declarative pipeline that triggers a success phase when release is done and will not re-run once it has failed any sort of publishing".

To solve that without config changes, we could just make use of git notes (which we know aren't driving any plugin flows at the moment), to track when the publish phase finishes and we have indeed either "reached success" or "reached failure", which are the two contractual states of the config file and flow orchestration by semantic-release.
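As a rough sketch of what tracking those two states with git notes might look like: the notes ref name, the JSON payload, and both helper functions below are hypothetical and chosen only for illustration. The `git notes --ref <ref> add -f -m <msg> <object>` form itself is standard git.

```javascript
// Hypothetical notes ref; semantic-release does not define a publish-state note.
const NOTES_REF = 'semantic-release-publish-state';

// Builds the git arguments that record the publish outcome on a commit.
function publishNoteArgs(sha, state) {
  if (state !== 'success' && state !== 'failure') {
    throw new Error(`unknown publish state: ${state}`);
  }
  const message = JSON.stringify({ publish: state });
  return ['notes', '--ref', NOTES_REF, 'add', '-f', '-m', message, sha];
}

// Parses note text back into 'success', 'failure', or null when absent or invalid.
function readPublishState(noteText) {
  try {
    const parsed = JSON.parse(noteText);
    return parsed.publish === 'success' || parsed.publish === 'failure'
      ? parsed.publish
      : null;
  } catch {
    return null;
  }
}

console.log(['git', ...publishNoteArgs('abc1234', 'success')].join(' '));
console.log(readPublishState('{"publish":"failure"}'));
```

Keeping the payload in a dedicated ref would avoid touching the notes that existing workflows already rely on, though the notes would still need to be pushed and fetched explicitly.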

travi · 2024-02-05T04:51:03Z

> I'm curious what your opinion is of the "Longer term solve 1".

i need to give this suggestion some more thought, and i don't have the space to do it full justice at the moment. there is certainly some potential to the suggestion, but i expect that there are some scenarios that would continue to pose large roadblocks. even if a full implementation were contributed, there is also significant risk that this level of additional complexity for the project results in far more maintenance/support burden than continuing to consider human intervention in scope in failure scenarios.

> With this current bug

i understand that the current behavior is not ideal, but as maintainer of the project, i want to set clear expectations around this. i do not consider this to be a bug. this is intended behavior with the current state of the project to expect human intervention in failure scenarios.

> we could just make use of git notes (which we know aren't driving any plugin flows at the moment)

while i agree that git notes is likely the best way to approach this sort of feature if we were to move forward with it, it does add more user-facing complexity to manage when there is a need to troubleshoot a problem. worth noting that we do already use git notes in our supported workflows, like pre-releases or maintenance releases. they already result in more difficult problem resolution than just deleting a tag in situations where someone rewrites the history of a release branch. adding more complex details there would need to be done with great care, possibly even going to the length of developing a tool to repair damaged notes interactively.
