Help understanding coverage decrease #1632
hi @jinhong-, I took a guess at the project and build you're referencing above. (I'm using it here since only you and your team members can access that link.) Looking at the PR build (RIGHT), compared to the PR's base build (LEFT):

I'm also confused by the coverage change, since I don't see any indicators of that difference in the files themselves, only in the RUN DETAILS, which correlates with the change. But I suspect those numbers are wrong and that the build may have been corrupted, so I re-ran the parallel jobs in the order they arrived and got a new result (a new coverage calculation), which confirms that. There is now NO CHANGE in coverage between the two builds:

I'm afraid there's little to go on when it comes to one-off corrupted builds, in terms of determining root cause. We would need to see a pattern, so please let us know, here or at support@coveralls.io, if this kind of result persists. In the meantime, to answer your other question:
The reason is that the FILES section only displays the line coverage in your project's coverage report, whereas your Coveralls repo, as configured in its SETTINGS, also tracks the branch coverage in your coverage reports and considers it in calculating total (aggregate) coverage. The aggregate coverage is reflected in RUN DETAILS, and you'll note that yours includes branch coverage details.

Here's the formula for aggregate coverage when branch coverage is included:
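To make the formula concrete, here is a small sketch of how aggregate coverage combines line and branch coverage into a single percentage (the function name and example numbers are mine, not from Coveralls; the formula itself follows Coveralls' documented approach of pooling covered lines and branches over relevant lines and branches):

```python
def aggregate_coverage(covered_lines, relevant_lines,
                       covered_branches, relevant_branches):
    """Aggregate (line + branch) coverage as a percentage.

    Covered lines and covered branches are pooled together in the
    numerator; relevant lines and relevant branches in the denominator.
    """
    return 100.0 * (covered_lines + covered_branches) / (relevant_lines + relevant_branches)


# Example: 90/100 lines covered, 30/60 branches covered.
# Line coverage alone would read 90%, but the aggregate is
# (90 + 30) / (100 + 60) = 75%.
print(aggregate_coverage(90, 100, 30, 60))  # 75.0
```

This is why the FILES tree (line coverage only) and RUN DETAILS (aggregate) can legitimately show different numbers when branch coverage is enabled.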
And here's the branch coverage setting in your project SETTINGS:
Thanks! We will keep observing. Misreporting of coverage seems to be happening fairly frequently. Does the order of parallel execution matter?
Here's another that failed with 0% again
@jinhong- Yes, I see the same behavior again. But I don't see any underlying reason for it. There is nothing out of order with your coverage posts, or how they came in. They are in a different order in the PR, but we're aware of that. (Coveralls knows that the previous job for build 2 in the PR build is job 3 in the base build, etc.)

The "reasoning" for the drop in coverage comes from the RUN DETAILS, and you can see how that is purported to change between the base build (LEFT) and the PR build (RIGHT), here:

Those RUN DETAILS are supposed to be the un-modified details from your coverage reports. So the first thought about root cause is an issue with your reports. But I can eliminate that if I re-run your PR build and get more accurate results, like last time. Which I did, and... again, the RUN DETAILS changed after Coveralls re-consumed each of your coverage reports (jobs).

So we have our pattern. But unfortunately I still can't name the cause. Obviously, something is interfering with the consumption of coverage reports the first time around. The next step in diagnosing would be for you to invoke your next builds in verbose mode and share your CI build logs with me (at least the portions related to Coveralls). I know your project is private, so feel free to send those to support@coveralls.io and just mention this issue. I will look for them.

Here's how to enable verbose mode for the Coveralls GitHub Action:
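As a sketch, a workflow step along these lines should produce verbose output (this assumes the `debug` input of `coverallsapp/github-action` v2; if you're on a different version or integration, check its README for the equivalent flag):

```yaml
# Example workflow step: enable debug/verbose logging for the
# Coveralls GitHub Action so the upload details appear in CI logs.
- name: Coveralls
  uses: coverallsapp/github-action@v2
  with:
    github-token: ${{ secrets.GITHUB_TOKEN }}
    debug: true
```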
Thanks.
I have sent the logs over to you
@jinhong- Got them, thanks. Will reply in email but backfill any details here that will help others.
Thanks, @dhui. @jinhong-, is that a viable workaround for you for the time being? In your SETTINGS, you would enter 0.1 into the COVERAGE DECREASE THRESHOLD FOR FAILURE field, like so:

We are trying to determine what kind of debug info / monitoring would help us understand what's happening with your initial builds that don't calculate properly.
Unfortunately, that may not help in our case, as the coverage seems to drop all the way to zero
@afinetooth how are you re-running the tests in the order they arrived? Are you able to expose this functionality so I can trigger a re-run myself? I am facing this issue fairly frequently
My theory is that the analysis is timing out or erroring out on your backend, and the behavior on timeout is to report coverage at 0%. I observed that the results took longer than usual to arrive. I am assuming there is some background processing involved.
@jinhong- Unfortunately it's not something I can expose for you to trigger. Right now, it's just an internal command I can execute via dev console, so it's not available via API or anything. It is planned for a future release, but probably not on a timeline to be of use here.

Your theory is reasonable. To test it, I ran a report of your last 100 builds (attached; it's anonymized), and your build times all look normal, except for the original build ID referenced above: 48890929. It's one of the only builds with a longer build time, and in that case the build time is an extreme outlier. Maybe you can look through the file and see if the IDs of any more of your problem builds match longer build times. (I don't really see any builds that took as long as the one mentioned, though.)

Note that the build ID is what's displayed in the URL for your build, not the label given by your CI service that appears on your build pages.
Also, I noticed the exact same PR/build (2345857520) would first fail with 0%, then pass afterwards. Did you trigger a re-run for build 2345857520? A few questions on the CSV file:
Hi @chapayevdauren. I'll need to know the Coveralls URL for your repo, or the URL for the problematic build. If it's private, or sensitive, please email support@coveralls.io and mention this issue. I'll get it and reply. |
@chapayevdauren — replied in email. |
@afinetooth |
Thanks @dhui, for the update. @dhui and @chapayevdauren, I also have an update: We don't currently understand the root cause, but we think it may be due to some recent issues described on our status page: That said, the normal behavior would be for the builds to take longer than normal, not complete incorrectly. So if the above is a cause, it's for a different reason, such as the calculation job failing before it can obtain its data, due to a timeout, etc. Will share updates here. |
@jinhong- @dhui @chapayevdauren

Workaround for this issue: the Rerun Build Webhook

While we've not yet identified a fix for this issue, we released a workaround today that should resolve it for you: the Rerun Build Webhook. Since the nature of the issue appears to be that, for some repos with parallel builds:
A Rerun Build Webhook, similar to the (Close) Parallel Build Webhook, fixes the issue by triggering your build to re-calculate itself.

Instructions

Call this at the end of your CI config, after calling the (Close) Parallel Build Webhook. Call it like this:
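A sketch of what the end of a parallel CI run might look like (the `<REPO_TOKEN>` / `<BUILD_NUM>` placeholders are illustrative, and the `rerun_build` endpoint and parameter names reflect my best understanding of the webhook described here; verify them against the Coveralls docs for your integration):

```shell
# 1. Close the parallel build as usual with the (Close) Parallel Build Webhook:
curl -X POST "https://coveralls.io/webhook?repo_token=<REPO_TOKEN>" \
     -d "payload[build_num]=<BUILD_NUM>&payload[status]=done"

# 2. Then trigger a recalculation of the closed build with the Rerun Build Webhook:
curl "https://coveralls.io/rerun_build?repo_token=<REPO_TOKEN>&build_num=<BUILD_NUM>"
```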
But substitute your own values.

Please note a few differences between the Rerun Build Webhook and the (Close) Parallel Build Webhook:
NOTE: In case you're having trouble determining what to use:

If you're using a different Coveralls integration and/or are still having trouble determining the correct values for either
I'm struggling to understand why coverage reports a decrease when there are no code changes that affect coverage. Screenshot attached below. The number reported in the file tree differs from the coverage decrease reported.