Aggregate results do not get reported correctly to GitHub #1631
@jrfnl May I re-process the jobs in your parallel build to make sure this is not a one-off issue? This will re-consume each of your coverage reports as they came in and re-perform your aggregate coverage calculation. If the results are the same, then we are on more solid footing trying to determine a root cause. Please let me know when you can. Thanks.
@afinetooth Please feel free, but I have seen the same issue with multiple PRs now, so it definitely isn't a one-off. Some other examples:
@jrfnl some quick updates:

On this PR build, mentioned above: I re-ran the jobs in the PR build and got a different result: specifically, the small increase in coverage you'd expect from the underlying numbers.

As for the next PR build mentioned above: I noticed a discrepancy in the parallel jobs listed for the build. Whereas in the base build: In the related PR build: We have a different set. At least the first job, 2148532125.1, appears to relate to a different PHP version (

Finally, a similar, but different, sort of discrepancy is at play in the example PR build from your original post: Whereas the first job in your list of jobs in the base build is for PHP That build is listed second in the related PR build (and version In other words, they've swapped places.

The potential issue: In other words, Coveralls expects that Job Again, while those jobs may run in parallel fashion on your CI runner and arrive at coveralls.io in a different sequence due to timing, the index of the job ( This is achieved by setting up your CI config in a consistent order. In other words, it's expected that you're running through your matrix build in the same order each time, so that jobs for each individual scenario (i.e.

Next steps / how to fix: On the other hand, the former issue (different jobs between builds) is harder for me to understand or suggest a fix for. Can you give me some more insight into what's happening on the CI side?
Well, yes, one of the builds referenced above switched a PHP version (PHP 7.4 to 8.1), but it didn't change the order of the jobs; it was just a straight swap.

As for the numeric suffixes (index): I don't pass those. I pass just and only the PHP and PHPCS version, that's it.

Other than that: I honestly don't understand why it would matter for the aggregate results whether the jobs line up. This is the workflow I use (relevant parts highlighted) to generate the base input and upload to coveralls:

I'll be making some more changes to the matrix soonish, but again, I'd expect that to have a positive impact on the aggregate result, not a negative impact because individual jobs don't line up...

It's the mismatch between the sum-total of code coverage as shown for the class tree vs the aggregate result in the build header/reported to GH which I don't get (the numbers pointed out by the arrows in the above screenshots).
@jrfnl, Ok, thanks for the update. I understand your concern better now, so I'll stay focused on the aggregate coverage and let the job-to-job comparison be secondary. In fact, as I look closer at some of those cross-build jobs, I see that we actually seem to be tracking them by their

So, re: aggregate coverage: Specifically, this build: Is correct now (using the arrows you referred to above):

As for the next two pairs of builds, which showed similar issues to the original:

Pair 1:
And: Pair 2:
I am going to re-run the jobs for the PR Builds to see if it corrects the issue in the same way as the first example... And it does: Pair 1:
And: Pair 2:
So this convinces me that Coveralls is not making an error in the calculation of your aggregate coverage; rather, something is interfering with the way it is receiving the jobs from your original POSTs. For instance, potentially, the calculation may be starting before the final job is received. Or something else. At this point I'm really not clear. I will dig a little deeper for a root cause, starting with your workflow file.

Speaking of which, upon review I'm not seeing the invocation of the Coveralls Github Action ( So, does that mean you're mixing two integrations?:

(I'm not sure that would be the issue, but it's atypical, which may warrant a closer look.)
To see if it clears the issue, we could run a quick test, without using the Coveralls Github Action. That would mean sending the Parallel Build Webhook in your final step in the most generic fashion (using
Excuse my syntax, which might not be 100% correct. See the php-coveralls notes on parallel builds and sending the parallel build webhook, here.
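To be concrete, such a step might look something like the following. This is only a sketch: it assumes a `COVERALLS_REPO_TOKEN` secret and uses the run ID (`github.run_id`) as the build number, following the general Coveralls webhook format:

```yaml
coveralls-finish:
  needs: coverage
  runs-on: ubuntu-latest
  steps:
    - name: Close parallel build (generic webhook)
      # payload[build_num] must match the build number used when the
      # individual coverage jobs were posted.
      run: >
        curl -s "https://coveralls.io/webhook?repo_token=${{ secrets.COVERALLS_REPO_TOKEN }}"
        -d "payload[build_num]=${{ github.run_id }}&payload[status]=done"
```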
I suppose it does. Then again, I had a hell of a time getting parallel build merging to work in the first place when these projects made the move from Travis to GH Actions. The documentation on this is sparse, to say it in the most polite way possible. I don't exactly remember how I ended up with this setup, but no doubt it was based on a combination of information found in official documentation from php-coveralls, Coveralls itself and the coveralls action runner, and reading up on how other people have solved similar issues, like here. I mean, the short of it is:
I did a lot of research and trial-and-error runs around that time, as I was moving the CI of some 40+ projects over to GH Actions. Also see earlier communications between us about this setup.

This set-up works - and this can also be seen in the aggregate of the

Happy to try to send the webhook via Curl, but I'm not sure what difference that would make. I mean: the "finished" hook gets received correctly and in the GH Actions workflow, the
@jrfnl thanks for the update:
Appreciate that. You're right. We've been working on a documentation update that's been long-coming, but we are close to releasing it. As an aside, one caveat is that, because Coveralls integrations are third-party OSS projects, their docs won't be unified with ours and some confusion like this may persist. Our recent attempts to address this have gone toward a universal coverage reporter (in beta) that will allow all integrations to POST coverage to coveralls.io in a consistent manner. (At this time, however, the project only supports LCOV and SimpleCov formats. The hope is that future OSS contributions will be toward adapters for new formats.)
Thanks for the background. I did just review our earlier communication and I recall what a challenge you faced in migrating your projects, and I'm sorry about that. I understand you're at the end of a long effort that actually applies to many repos, so I didn't mean to be flip by suggesting a different way of sending the webhook. TBH, I was reaching for a root cause and seizing on anything off-pattern that could be isolated and removed as a possible factor.

Having seen this type of behavior before, it's been my experience that, even if they're not technically "one-offs," such issues are usually temporary. The fact that we can re-run your jobs and get correct aggregate calculations indicates some sort of interference with the regular process that happens in real time when you send your initial POSTs. Without absolute proof, my best theory involves temporary environmental factors, like severe traffic or an API outage, where something like this is more likely to happen: a particular coverage job POST fails to succeed on its first try and is sent to a retry queue, but the webhook call succeeds before it and closes the build. Then the retry succeeds and adds a job to the parallel build, but the total results are not re-calculated because the close webhook (trigger) does not re-fire.

I wish I had not re-run all of your example builds, because then I could test this theory by checking timestamps, etc. If you have any more instances, please do send them so I can do more investigation before re-running to test the nature of the issue. In the meantime, to address the observation that this is happening for PR builds and not for push builds (on
I am seeing this issue in all of my repos that are using Coveralls. All of them are open source projects, so you can easily see the issues. Here is the last one: https://coveralls.io/builds/48906664 The decrease in the individual test variants is due to branches added that only get run on certain versions of ActiveRecord, but the "merged" coverage is 100%, yet it is reporting 97.35%. The latest branches in the SchemaPlus repos also have the issue ( https://coveralls.io/github/SchemaPlus )
Workaround for this issue: the Rerun Build Webhook

While we've not yet identified a fix for this issue, we released a workaround today that may resolve it for you: the Rerun Build Webhook. Since the nature of the issue appears to be that, for some repos with parallel builds:
A Rerun Build Webhook, similar to the (Close) Parallel Build Webhook, fixes the issue by triggering your build to re-calculate itself.

Instructions

Call this at the end of your CI config, after calling the (Close) Parallel Build Webhook. Call it like this:
But substitute your

Please note a few differences between the Rerun Build Webhook and the (Close) Parallel Build Webhook:
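As a sketch in a GitHub Actions context (the endpoint path and parameter names here are my reconstruction from later comments in this thread, which confirm the webhook takes a `repo_token` and a `build_num`, so treat them as assumptions rather than verbatim documentation):

```yaml
- name: Coveralls rerun build
  # build_num must match the one used for the original coverage POSTs;
  # on GitHub Actions that is typically the run ID.
  run: >
    curl -s "https://coveralls.io/rerun_build?repo_token=${{ secrets.COVERALLS_REPO_TOKEN }}&build_num=${{ github.run_id }}"
```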
@afinetooth Hi James, thanks for getting back to us. I'll implement this tomorrow in a few repos for which I expect a fair number of builds over the next few weeks. I will report back on how well it works. One question before I implement this - in GH Actions, would you recommend implementing this as a separate (or should I implement it one way in one repo, the other way in another, and see what works best?)
Looking your instructions over again - I actually have another question:
The

Neither the upload of the test results via PHP Coveralls, nor the finish hook via the Coverallsapp GH Action, requires me to set that manually, so I have no clue what to fill in for

The workflows in which I encounter the issue typically look like this (shortened to the relevant steps):

```yaml
coverage:
  runs-on: ubuntu-latest

  steps:
    # <snip>Multiple steps to run the tests....</snip>

    - name: Install Coveralls
      if: ${{ success() }}
      run: composer global require php-coveralls/php-coveralls:"^2.4.2" --no-interaction

    - name: Upload coverage results to Coveralls
      if: ${{ success() }}
      env:
        COVERALLS_REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        COVERALLS_PARALLEL: true
        COVERALLS_FLAG_NAME: php-${{ matrix.php }}
      run: php-coveralls -v -x build/logs/clover.xml

coveralls-finish:
  needs: coverage
  runs-on: ubuntu-latest

  steps:
    - name: Coveralls Finished
      uses: coverallsapp/github-action@master
      with:
        github-token: ${{ secrets.GITHUB_TOKEN }}
        parallel-finished: true
```
Hi @jrfnl. You're so right. I forgot about the (Close) Parallel Build Webhook being a special feature in some integrations, not requiring you to build up that request.

So what we're looking for, for

That value will be different for different CI service / integration combinations, but for the Coveralls Github Action, the correct environment variable will be

You're right about
That's actually a great question. I have only tested the endpoint itself, not particular CI configs, so you'll be one of the first. You may be doing us all a favor by testing both ways, but my recommendation is to start with a new

At this point, I don't expect race conditions (where the rerun build completes before the close build (and coverage calculation)), because I believe those jobs run in transactions, but I suppose it's a possibility if the former fails and has to retry.
The aggregate coverage results reported back by Coveralls for PRs have been off for quite a while now. While the Coveralls team has not found the root cause yet, they have put an endpoint in place which _should_ help in working around this. The short of it is that the initial run results get calculated incorrectly, but that a re-run of the same coveralls build gets the correct results for the calculation, so triggering this calculation re-run from within the build _should_ (theoretically) fix the issue. I am the first to implement this solution though, so consider this a trial for the time being. Ref: * lemurheavy/coveralls-public#1631
Thanks for getting back to me, @afinetooth!
Thanks for that! I've also confirmed that PHP Coveralls uses the same variable when submitting the test run results:
I've set up PRs for a couple of repos to try this out:
In all cases, the "Rerun" step looks to succeed, but actually fails. I've tried with both

In the first case -
Src: https://github.com/PHPCSStandards/PHPCSExtra/actions/runs/3090849883/jobs/5000239773#step:2:11 In the second -
Src: https://github.com/PHPCSStandards/PHPCSExtra/actions/runs/3090983008/jobs/5000542473#step:2:10 Next, I tried using
Src: https://github.com/PHPCSStandards/PHPCSExtra/actions/runs/3091059721/jobs/5000714503#step:2:10 All PR builds for the above mentioned PRs show a difference in code coverage, while each PR only touches the GH Actions workflow, not the code, so the difference should be (and in reality is) 0%.
I presume the rerun fails due to the "Invalid repo token" error, even though each of them sends the

Personally, I also find the silent failing of the step, even though there is an error, problematic.
Hi @jrfnl thanks so much for running those tests. I think I know what the issue may be and will try to fix it and re-deploy today. I will let you know here. In the meantime, I'm not sure I agree about the "silent failing" of the job for two reasons:
I will definitely consider trying to change that, but for now I would prefer to address it in two (2) ways: (1) first, fix the bug, so that, after completing the config once, users have a reasonably good expectation that the same config will always work; and (2) second, we'll be adding a Rerun Build button to the build page UI, so this can be a backup measure if, as a user, you go to the build page and suspect that the coverage is wrong / that the rerun failed. Hopefully this can suffice for the time being.
Hi, shouldn't

Using the project token, the rerun payload gets accepted. (However, the coverage is still not calculated correctly; the value in the header is still inconsistent and much lower than the coverage displayed below the "Source files" header.)
@OndraM Good question, though in that case, I wonder if that should be changed in more places (and in the documentation in various places too...). See this workflow example: #1631 (comment)
But then how is the Github Action supposed to find that repo token?
In that case, each user should add the repo token from Coveralls to their repo/organisation "secrets" (Settings -> Secrets -> Actions) and then refer to the secret in the GH Actions workflow using something like
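For illustration, a step along these lines (the secret name `COVERALLS_REPO_TOKEN` and the step name are placeholders; you choose the secret name yourself when creating it in the repo settings):

```yaml
- name: Step that needs the Coveralls repo token
  env:
    # Exposes the repo-level secret to this step only.
    COVERALLS_REPO_TOKEN: ${{ secrets.COVERALLS_REPO_TOKEN }}
  run: echo "The token is available to this step as \$COVERALLS_REPO_TOKEN"
```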
There is already a link between GitHub Actions and Coveralls. It is much easier for them to set the path to use
Agreed, I was just answering your question on the "how", not advocating for making this change, which would mean a lot of work for every single user of Coveralls.
@afinetooth Any news?
@afinetooth, any updates on this issue?
@OndraM @jrfnl @ylavoie @guy-yogev-lmnd Sorry for the delay in getting back to this issue. Thanks for your patience. I am checking each referenced build and will reply momentarily.
Yes. That's correct. Sorry if I misled you or @jrfnl on that point.
Can you please share the Coveralls URL for the Pull Request build that's showing the error?

One important point is that the aggregate coverage (in the header) will not always match the source file coverage at the top of the source files tree. That's because the SOURCE FILES table displays line coverage, while the header may be displaying both line coverage and branch coverage. Branch coverage is included in / excluded from the aggregate coverage calculation with this repo setting. (By default, branch coverage is ENABLED, so if it's present in your LCOV report, it will be applied in the calculation.)

Here is the aggregate coverage calculation formula when branch coverage is included:
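In rough terms (my paraphrase of the documented calculation, not a verbatim quote):

```latex
\text{aggregate coverage} =
  \frac{\text{covered lines} + \text{covered branches}}
       {\text{relevant lines} + \text{total branches}}
```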
Without branch coverage, it's just:
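Which, as I understand it, amounts to:

```latex
\text{aggregate coverage} = \frac{\text{covered lines}}{\text{relevant lines}}
```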
Another thing to consider is that, even if the aggregate coverage is wrong, it should be accurate per the data in the RUN DETAILS section. If you share your build URL, we can look at those details together. Thanks.
Exactly, thank you. As a workaround, it's not the most ergonomic solution. It's much simpler in other CI contexts, due to the fact that Github Actions use the
This is true. As I say above, as this is a workaround, it's not a very ergonomic solution. It's much more straightforward to implement in the context of another CI, like CircleCI or Travis, just because only the
I'm sorry, I failed to report here that I deployed the fix I had in mind. In that case, then, I'm afraid it did not work for you, at least in the build you tested. I would like to look at that one again, if you wouldn't mind sharing the Coveralls URL for it.
I understand completely. Can you please share the Coveralls URL(s) for the latest build(s) this workaround did not work for? The verbose build log for the same would be great, but not necessary. Anecdotally, this workaround is not working for about 2 in 10 users. While the issue itself is being worked on, I'd like to make sure that the workaround, at least, works 100% of the time.
@jrfnl as I missed the opportunity before, I will use your last PRs mentioned above as cases to look into right now:
About which, you explained:
Makes sense. I will show them side-by-side with their base builds (LEFT), since you indicated the pull_request build (RIGHT) should not have changed in coverage.

Since aggregate coverage should match the underlying data in the RUN DETAILS section of each build, I'll highlight the differences there in each case. (As I said earlier in this thread, the aggregate coverage in the header of the build page should reflect the data in the RUN DETAILS section, per the formula for coverage, whereas it won't always match the coverage in the source tree, depending on whether or not branch coverage is being included in aggregate coverage.):
Before: In this case, I suspect the
Result of (manually) re-running the build: After: COVERAGE CORRECTED ITSELF:
Same situation as the previous example. After (manually rerunning build): COVERAGE CORRECTED ITSELF:
Again, after forcibly re-running the build: COVERAGE CORRECTED ITSELF:

The issue that concerns me right now is that the workaround is not working for you. Again, my context is the workaround, since I'm not currently working on the fix for the root cause. (Someone else is.) I will try to understand and fix why the
This seems to be in direct contradiction to what you said about this earlier in this thread when I asked the same question:
Work-around PR
Open PR with the work-around: PHPCSStandards/PHPCSUtils#342
Notes:
You're right. I can't think of a reason I would have said that unless I thought you meant that's the right token to use for the Github Action, rather than for the Rerun Build Webhook. I am truly sorry about that.
Thank you. Digging in now...
Let me address these example builds, first:
So, the first example build: Is one of the ones I manually re-ran. So it now matches the coverage of its base build, as expected. But it's no surprise to me that it didn't fix itself after the original CI run, because I see that the Rerun Build call failed with

In any case, so much for that one.

As for the second build you mention: That is not one of the builds I manually re-ran earlier today. But the coverage does match that of its base build. Since we know from the transcript you provided for that build that the Rerun Build Webhook also failed there, with

As you said:
That's good news. It would be even better news if that behavior were consistent now. We have been tackling a lot of issues lately, so I suppose it's possible the key factor in this issue's root cause was fixed.
Right. I saw that. Did it once include branch coverage? If so, I wonder if that indicates branch coverage as a factor in this issue's root cause. I will def share that internally. Thanks for pointing it out.
That's good. As I say, please let me know if it's consistent or not. 🤞
Ok, let me take that one separately... I'm not sure what's going on there, but what I noticed right away is that those files are duplicates: I assume that's not how it looks in your Github repo? |
@afinetooth Thanks for looking into this further.
I'll try and find the time over the next week to figure out setting the repo token via secrets and will update the PR to use the repo token.
❤️ That would be great news, but let's wait and see for a bit to see if the results hold.
None of the repos I'm involved with have ever sent branch coverage to Coveralls. I do use it locally, but with a separate config file, as branch coverage is only available in recent versions of the test tooling I use (PHPUnit 9.3+), while the tests in CI are run on a wide range of PHPUnit versions (and older PHPUnit versions would not even run if the config included the settings to enable branch coverage).
Correct, the repo doesn't contain duplicate files. You can see for yourself: https://github.com/PHPCSStandards/PHPCSExtra/tree/develop/Universal/Sniffs
P.S.: oh and just to be clear - those "blank" results with duplicate files being recorded weren't a one-off; it's been happening regularly:
@jrfnl came across this one from way back while performing a Google search! Regarding your last message, I had a look at one of the builds and confirmed the duplicate files: That's an old build, but is this still happening anywhere for you? I think you may have since migrated from php-coveralls to the Coveralls GitHub Action (though not all projects perhaps), so, whether that's the case or not, I'm pretty sure I understand the root cause for those happening back then, as now, so I can probably help if it's still an issue anywhere. Please let me know.
@afinetooth Blast from the past ;-) Though reading back some of the above discussion does go a long way to explain why there was so much discussion and confusion about whether to use the
I don't recall seeing these issues anymore in the past six months or so. A few tenths of a percentage point off once in a while, but nothing like I was seeing when I opened this issue, and no duplicate files either.
Correct, and for all projects, though there may still be some open PRs from that series. The vast majority of the PRs have gone through and been merged though.
At this time, I think this issue can be closed, at least for those things I reported. I can't speak for others who reported similar issues in this thread. |
I'm seeing some very weird and wildly incorrect results coming in on PR builds, which is making Coveralls unreliable and disruptive: merges are either blocked because the Coveralls build is unjustly failing, or merging with confidence is no longer possible.
To illustrate:
Last build against the default `develop` branch for a repo:
Link: https://coveralls.io/builds/48211375

Build for a PR against `develop` which is intended to increase code coverage:
Link: https://coveralls.io/builds/48212375
Report in the PR:
Link: PHPCSStandards/PHPCSUtils#308
What's wrong
As you can see from the above screenshots, the PR in actual fact makes the code coverage go UP (from 96.71% to 97.3%).
The number at the top of the PR build is incorrect (88.95%), but that is the number which is being reported to GitHub, causing the build to fail.
Not sure what's going on - I haven't changed any repo settings on Coveralls recently or anything.
Help?