Coveralls reporting a decrease overall (using parallel), even though nothing changed #1653
Hi @xcelerit-dev. Thanks for reporting your issue. This is something that's been happening lately for a handful of repos, as far as we know (example 1 | example 2). Specifically, the behavior is that PRs show a decrease in coverage that, by the evidence, should not occur. Then, when the jobs in the PR build are re-run, the PR's coverage % corrects itself.

Note these side-by-side comparisons between your base branch (LEFT) and your PR branch (RIGHT). First, it shows the decrease you found suspect. It looks suspect to me for the same reasons; in addition, while each individual job has the exact same coverage % as in the base build, the run details tell a different story. A build's coverage % always agrees with the numbers in the RUN DETAILS.

After I re-run the jobs in the PR build, we see no change in coverage between base and PR, and the RUN DETAILS align in terms of relevant lines covered (hits per line change a bit, but that does not affect coverage %).

Unfortunately, we don't know the root cause of this issue. For some users it is consistent, for some occasional. Is it occasional or persistent for you?

In your case, there is one thing I see wrong that I would like to correct, to see if it has an impact (which could fix your builds, or otherwise help determine the root cause of the shared issue): your builds are not sending status updates. This can sometimes be caused by an expired OAuth token (of your repo owner, in this case), which can cause other problems like failed API calls to GitHub, and could potentially play a role in what's happening on the first run of the PR build. To fix this I would normally reassign repo ownership to another user of the repo, ideally one with Admin access. But in your case, I see you are the only user, and that you have Admin access.
Unless there is another user you're aware of that we can test with, we'll have to find out why status updates aren't reaching GitHub with the permissions attached to your OAuth token. Do you have any insight into why that may be? Please give me a few moments to explore further and I'll get back with my own findings. BTW, to set expectations: if resolving this issue doesn't fix your PR builds (meaning: stop future PR builds from reporting false coverage drops), then you are squarely in the group of repos having this problem with no currently known cause. In that case I'll add you to the ticket with those other projects and update you with any progress from there.
Ok, it turns out I was wrong. You were not the owner of your repo. You were the only user of the repo, but your repo had no owner. I made you the owner and re-ran your builds, and they started sending status updates. (You can check GitHub to verify status updates on your last three (3) builds.) Please proceed and let me know, here, if any of your next PR builds also show an incorrect decrease in coverage %. I'm not sure whether this change will resolve that issue, but it would be great to know. The other cases, BTW, did not have this issue, AFAIK.
Thank you for all your investigations - that was helpful. However, we see this happening again for the next PR:
While we could modify thresholds etc. to keep working (get checks to pass), that's not a solution. The coverage badge will also keep fluctuating for no reason. What do you suggest we do to get around this? Keep triggering new builds in the PRs until we get a consistent result?
We tried a rebuild of the whole workflow, but this only adds more jobs to the same Coveralls build number and doesn't change the overall coverage percentage.
Yes, I'm sorry to say, you're squarely in the pattern of this known but currently unresolved issue. Just to make sure, using this PR build as a test, I re-ran the jobs in the build and saw the coverage % change and correct itself. Sorry. I'll add you to the issue and share any updates.
Hi, we are affected as well:
We are getting this behavior for all pull request builds now. However, push builds report the coverage OK.
Hi, @OndraM. Thanks for the report. It sounds like we'll need to add you to this issue, which I'll do as soon as I'm able to verify that there isn't another underlying cause in your case. We're experiencing an extremely high volume of support requests this week and are working through a backlog. We'll respond to your issue as soon as we possibly can. Thanks.
If more data points help, I think https://github.com/Torgen/codex-blackboard also has this issue. The Coveralls site shows the aggregated coverage, but the comment and check result show the lowest single sub-result of the parallel runs.
Workaround for this issue: the Rerun Build Webhook

While we've not yet identified a fix for this issue, we released a workaround today that should resolve it for you: the Rerun Build Webhook. The nature of the issue appears to affect some repos with parallel builds. The Rerun Build Webhook, similar to the (Close) Parallel Build Webhook, fixes the issue by triggering your build to re-calculate itself.

Instructions

Call this at the end of your CI config, after calling the (Close) Parallel Build Webhook, substituting your own repo token and build number. Please note a few differences between the Rerun Build Webhook and the (Close) Parallel Build Webhook.
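A sketch of that sequence as shell commands. The rerun endpoint and its parameters come from the curl command shared later in this thread; the close-webhook payload field names are an assumption based on Coveralls' parallel-builds documentation, and the variable names are placeholders for your own values:

```shell
# 1. Close the parallel build first (payload field names are an assumption
#    to verify against Coveralls' parallel-builds docs):
curl "https://coveralls.io/webhook?repo_token=${COVERALLS_REPO_TOKEN}" \
  -d "payload[build_num]=${BUILD_NUM}&payload[status]=done"

# 2. Then trigger the Rerun Build Webhook workaround (endpoint as given
#    in this thread), so the build re-calculates its overall coverage %:
curl --location --request GET \
  "https://coveralls.io/rerun_build?repo_token=${COVERALLS_REPO_TOKEN}&build_num=${BUILD_NUM}"
```

Here COVERALLS_REPO_TOKEN and BUILD_NUM stand in for your own repo token and the build number of the parallel build being closed.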
@afinetooth Thank you for the description. We'd like to test this, but we're wondering how we can get the repo token for this step:

    - name: Rerun coverage workaround
      run: |
        curl --location --request GET 'https://coveralls.io/rerun_build?repo_token=XXXX&build_num=YYYY'
We worked out the required change after placing a secret for the repo token.
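For reference, the step body might look like this once the token is injected from a CI secret rather than hard-coded. The secret/env-var names here (COVERALLS_REPO_TOKEN, BUILD_NUM) are hypothetical choices, not confirmed in this thread; only the webhook URL comes from the comments above:

```shell
# Call the Rerun Build Webhook with the token read from a secret-backed
# environment variable instead of a hard-coded value. BUILD_NUM should be
# the same build number used for the (Close) Parallel Build Webhook.
curl --location --request GET \
  "https://coveralls.io/rerun_build?repo_token=${COVERALLS_REPO_TOKEN}&build_num=${BUILD_NUM}"
```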
@xcelerit-dev apologies. I forgot that some Coveralls integrations (like the Coveralls GitHub Action) have special features around the (Close) Parallel Build Webhook, such that you don't need to build the request yourself. You got it right. This came up right away after I posted the solution here as well, so you can see a longer explanation there.
@afinetooth Hi, thanks for the proposed workaround. It works sometimes, but it also seems to fetch random results from one of the parallel runs, taking an even lower coverage result as the canonical value for the entire build. I'm not sure if I did something wrong, but I added the rerun webhook call as described.
I was wondering if anyone else has encountered this oscillation after applying the workaround.
@fabiode Can you share the URLs for 2-3 recent builds where this is happening? I feel like recent changes to our code should make this original issue far less likely to begin with. But I'd also like to verify that we received your rerun webhook, etc. If your repo is private or sensitive, please email us at support@coveralls.io and mention this issue. I will get it.
@fabiode Got your request to support@coveralls.io and will reply there. Thanks. |
Hi,
We're struggling to understand why the coverage in a PR decreased in the overall coveralls reporting, even though the detailed numbers below suggest otherwise (and none of the test code was changed). It seems that it's one of the parallel jobs that is reported, not the overall result.
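To illustrate the expected behavior: the overall percentage for a parallel build should be a ratio of totals across all jobs, not any single job's percentage. A minimal shell sketch with illustrative numbers (the line counts below are made up, not from this build):

```shell
# Overall coverage = total covered lines / total relevant lines across all
# parallel jobs -- not the lowest (or any single) job's percentage.
# Illustrative jobs: job1 covers 80/100 lines, job2 covers 60/100 lines.
covered=$((80 + 60))
relevant=$((100 + 100))
overall=$((100 * covered / relevant))
echo "overall coverage: ${overall}%"   # 70%, even though the lowest job is 60%
```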
Here is a screenshot for reference:
This is the link: https://coveralls.io/builds/51121924
And it corresponds to this PR: auto-differentiation/xad#13
Any ideas?