Add a structured "report", probably JSON #720

nedbat · 2018-10-09T10:30:19Z

There are lots of use cases that want detailed data from coverage. The only way they have to get it today is from the XML report, which is limited by being a Java-native format.

A new report which included all of the information in a convenient machine-readable form would enable more after-tools to be built for coverage. @atodorov started work on a JSON report here: https://bitbucket.org/ned/coveragepy/pull-requests/61/add-json-command/diff

atodorov · 2018-10-09T10:34:46Z

@nedbat it's been a while, but can you summarize the status of the code on the BitBucket pull request?

If there are only a few things left to do (like resolving conflicts and minor issues) I may be able to pick it up where I left it and send a PR here on GitHub.

nedbat · 2018-10-09T10:39:47Z

@atodorov to be honest, I haven't looked at it closely in a while, but thanks. If you move it here, we can start reviewing it.

Bachmann1234 · 2019-07-09T04:38:01Z

I was curious, and it looked like this issue was stale, so I started futzing with this by branching off the 4.x line: https://github.com/Bachmann1234/coveragepy/tree/json_report_4

My POC basically duplicated the XML report, replaced all the XML talk with dict calls, and called json.dumps.

The resulting report looks like https://gist.github.com/Bachmann1234/6347d0db7437e71323a2c391e3031ac0 (assuming I did it right. It's a spike right now and would need more tests/verification).

I think doing this the right way, though, would mean an approach without so much duplication. Perhaps create a dictionary, have the JSON report dump it, and have the XML report use that same dictionary to drive its rendering.
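To make the "one dictionary, two renderers" idea concrete, here is a minimal sketch. It is not coverage.py code: the function names (gather, render_json, render_xml), the dict layout, and the XML element names are all made up for illustration, and the XML is deliberately not the Cobertura schema.

```python
import json
import xml.etree.ElementTree as ET


def gather(file_lines):
    """Build one plain dict from {filename: (executed, missing)} line sets.

    Hypothetical helper: the input shape is an assumption for this sketch.
    """
    return {
        name: {
            "executed_lines": sorted(executed),
            "missing_lines": sorted(missing),
        }
        for name, (executed, missing) in file_lines.items()
    }


def render_json(data):
    """The JSON report is just a dump of the shared dict."""
    return json.dumps(data, indent=4, sort_keys=True)


def render_xml(data):
    """The XML report walks the same dict instead of re-analyzing files."""
    root = ET.Element("coverage")
    for name, info in sorted(data.items()):
        file_el = ET.SubElement(root, "file", name=name)
        hits = [(n, 1) for n in info["executed_lines"]]
        hits += [(n, 0) for n in info["missing_lines"]]
        for number, hit in sorted(hits):
            ET.SubElement(file_el, "line", number=str(number), hits=str(hit))
    return ET.tostring(root, encoding="unicode")


data = gather({"example.py": ({1, 2, 4}, {5})})
print(render_json(data))
print(render_xml(data))
```

The point is only that the analysis happens once in gather, and each report format becomes a small function over the result.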

Though ultimately it comes down to what we want from the JSON report, I guess? Is keeping it close to the Cobertura XML appropriate? 🤷‍♂

Anyway, I'll keep playing with this.

nedbat · 2019-07-09T11:23:11Z

@Bachmann1234 thanks, I think people would appreciate a supported structured report format. It's probably ok to be working from the 4.x line, since the code you are looking at hasn't changed much. I think you are right that a JSON report and the XML report might share a data gathering phase, and then write the data differently.

I recently did something similar to the HTML report in 5.x, which made it easier to test.

BUT: I don't want to follow the Cobertura schema: they focus on "classes", which isn't right for Python. Let's get a data format that is right for coverage.py, and then assess how well a common data gathering phase would work. Don't for

And we need to deal with the new data in 5.0: contexts, including dynamic contexts. It's not clear how best to handle it, because it could balloon quickly. Should a JSON report be compact (in some kind of 3rd-normal form) or convenient (which could involve duplicating a lot of data)?
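As a rough illustration of that trade-off, the sketch below converts between the two shapes. Everything in it is an assumption for discussion: the function names, the "contexts" and "lines" keys, and the idea of referring to contexts by list index are not an agreed format.

```python
def to_compact(line_contexts):
    """Store each context name once; lines refer to contexts by index."""
    names = sorted({c for ctxs in line_contexts.values() for c in ctxs})
    index = {name: i for i, name in enumerate(names)}
    return {
        "contexts": names,
        "lines": {
            line: sorted(index[c] for c in ctxs)
            for line, ctxs in line_contexts.items()
        },
    }


def to_convenient(compact):
    """Expand the compact form so every line repeats its context names."""
    names = compact["contexts"]
    return {
        line: [names[i] for i in ids]
        for line, ids in compact["lines"].items()
    }


# Convenient form: easy to read, but names repeat on every line.
convenient = {
    1: ["test_a", "test_b"],
    2: ["test_a"],
    3: ["test_a", "test_b"],
}
compact = to_compact(convenient)
print(compact)
print(to_convenient(compact) == convenient)
```

With long dynamic context names and many lines, the repeated strings in the convenient form are where the size would balloon; the compact form trades that for an extra lookup on the reader's side.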

I'm really glad you are looking into this. Let's keep talking through the design, it will make a lot of people happy to have something like this.

Bachmann1234 · 2019-07-09T13:23:06Z

I'll keep thinking out loud in case life happens and I fade away.

So Python has packages, modules, classes, functions, and lines.

Not sure how to handle the recursive nature quite yet (functions can contain functions...). My instinct is that trying to provide that much detail may just make the report hard to use.

Maybe just focusing on package, module, line?

I'll take a look at the HTML and console report and see if they are better guides.

nedbat · 2019-07-09T14:49:01Z

Internally, coverage.py deals with files, lines, arcs (jumps from line to line for branch coverage), and contexts. For a start, the JSON report only has to do a good job getting the existing data out in a supported structured form.

Bachmann1234 · 2019-07-09T15:09:07Z

sounds like a plan! -Matt Bachmann

Bachmann1234 · 2019-07-10T03:08:50Z

Quick update. Based on feedback, I updated my branch to be based on master rather than 4.x (I fixed the issue I was having) and made a simplified JSON report. The branch is living here: https://github.com/Bachmann1234/coveragepy/tree/json_report

Currently it's just built using the public API. Next I need details that are not currently in any public API.

The branch is not tested right now. Mostly because I kinda assume the "schema" for this is going to iterate a bunch. Once we start getting close to something that would be useful then I can start locking down the behavior with tests

atodorov · 2019-07-10T07:31:40Z

Folks, I'm just letting you know I won't be able to work on this at all. OTOH you may want to check-out this thread on PyCQA mailing list:
https://mail.python.org/pipermail/code-quality/2018-September/001065.html

Bachmann1234 · 2019-07-10T11:51:00Z

It's an interesting spec. As someone who manages a tool that works with a lot of different static analysis tools I love the idea of all of them using a unified machine parsable format.

I'm not super convinced SARIF is a good fit for a coverage tool. It seems more designed for rule-based linters, where you have a set of rules and you are reporting on sections of code violating those rules.

It's always been my experience with coverage tools that they mostly are there to draw a picture, and the interpretation of that picture is left for other tools (features like 'fail under' notwithstanding).

This JSON report would provide a picture. I could imagine another tool which could define coverage rules for parts of a codebase, take in this report, and spit out a SARIF-style report. But that's probably a project outside of this one.

Bachmann1234 · 2019-07-12T03:42:43Z

Status report:

Here is what I have for a simple report. Nothing involving contexts or branch coverage yet (I'm still wrapping my head around arcs...)

{
    "version": "5.0a6",
    "timestamp": "1562902290",
    "measured_files": [
        {
            "measured_file": "example.py",
            "missing_lines": [
                5
            ],
            "executed_lines": [
                1,
                2,
                4,
                5,
                7,
                8
            ],
            "summary": {
                "missing_lines": 1,
                "covered_lines": 5,
                "num_statements": 6
            }
        }
    ],
    "totals": {
        "missing_lines": 1,
        "covered_lines": 5,
        "num_statements": 6
    }
}

For a run on a non-trivial script, here is a gist:

https://gist.github.com/Bachmann1234/88eb0c941112550034cc8de62ca3a9d7

I'm fairly happy with how the code shook out, and it was not as much of a hack spike as I was worried it would be. So I'm actually going to write tests for what I have and aim to put up a WIP PR this weekend.

Bachmann1234 · 2019-07-12T04:22:37Z

So I thought I was done for the night but I kinda kept messing around.

I added branch coverage stats and context stuff. I still think there is a lot to do. Tests, docs, settings around the report. Iterating on the structure of the report itself. etc etc. But I am excited :-)

Anyways, here is a version with branch coverage
https://gist.github.com/Bachmann1234/b251cbfac7033ba52d56c87e3e243696

Here is a version without branch
https://gist.github.com/Bachmann1234/a3ec2ddbcbeed621e8d756b917d48ce1

I still don't quite get arcs. They are pairs of line numbers, but when I look at the data there are a lot of negative numbers. I don't know what that means.

Bachmann1234 · 2019-07-12T04:36:29Z

Note to self: "measured_files" should be a dict keyed on relative file path. That is probably far more useful than a list.

nedbat · 2019-07-13T22:28:26Z

Thanks for keeping on this. It might be a little easier to look at schemas schematically (so to speak) rather than as a full data file.

In arcs, a negative first number -N means "entered a function starting at line N", and a negative second number -N means, "exited a function starting at line N."
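A small sketch of that rule, in case it helps when reading the raw data. The helper name describe_arc is made up here, and the sample arcs are invented rather than taken from a real run.

```python
def describe_arc(arc):
    """Turn one (from_line, to_line) arc into a readable description."""
    start, end = arc
    if start < 0:
        # Negative first number -N: entered a function starting at line N.
        return f"entered the function starting at line {-start}, arriving at line {end}"
    if end < 0:
        # Negative second number -N: exited a function starting at line N.
        return f"exited the function starting at line {-end}, leaving from line {start}"
    return f"jumped from line {start} to line {end}"


for arc in [(-1, 2), (2, 3), (3, -1)]:
    print(arc, "->", describe_arc(arc))
```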

We should have percentages also, like the final column from an HTML report.

Maybe "measured_files" should just be "files", and have it be an object with the file name as the key?
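Schematically, those two suggestions (percentages, and a "files" object keyed by file name) amount to a reshaping of the list-based layout posted earlier. The sketch below shows one way it could look; the reshape function and the "percent_covered" key are assumptions for discussion, not a settled schema.

```python
def reshape(report):
    """Return a copy of the report with a "files" object keyed by file name."""
    files = {}
    for entry in report["measured_files"]:
        summary = dict(entry["summary"])
        statements = summary["num_statements"]
        # Assumption: a file with no statements counts as fully covered.
        summary["percent_covered"] = (
            100.0 * summary["covered_lines"] / statements if statements else 100.0
        )
        files[entry["measured_file"]] = {
            "executed_lines": entry["executed_lines"],
            "missing_lines": entry["missing_lines"],
            "summary": summary,
        }
    # Keep every other top-level key, and swap the list for the keyed object.
    reshaped = {k: v for k, v in report.items() if k != "measured_files"}
    reshaped["files"] = files
    return reshaped


old = {
    "version": "5.0a6",
    "measured_files": [
        {
            "measured_file": "example.py",
            "executed_lines": [1, 2, 4],
            "missing_lines": [5],
            "summary": {"missing_lines": 1, "covered_lines": 3, "num_statements": 4},
        }
    ],
}
new = reshape(old)
print(new["files"]["example.py"]["summary"])
```

Keying by file name lets a consumer look up one file directly instead of scanning a list.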

nedbat · 2019-08-31T11:26:01Z

#825 added a JSON report.

nedbat · 2019-09-21T21:01:52Z

This is in 5.0a7.

nedbat added the enhancement New feature or request label Oct 9, 2018

nedbat mentioned this issue Oct 9, 2018

Improve fail_under by enabling more refined control on threshold #666

Open

nedbat mentioned this issue Jun 18, 2019

Fail coverage based on absolute number of uncovered lines #815

Open

Bachmann1234 mentioned this issue Jul 15, 2019

Json report #825

Merged

nedbat closed this as completed Aug 31, 2019

vorpal-buildbot mentioned this issue Dec 15, 2019

Update coverage to 5.0 PennyDreadfulMTG/Penny-Dreadful-Tools#6937

Merged

nedbat added the fixed label Mar 10, 2020

Paebbels mentioned this issue Jan 12, 2024

JSON coverage files should contain a format version #1732

Closed
