add --compatible diff flag to output a diff more compatible with other tools #647

infokiller · 2022-12-12T07:47:01Z

For context: dandavison/delta#1256

infokiller · 2023-06-10T05:50:53Z

@vidartf did you get a chance to look at this? thanks!

vidartf · 2023-11-06T17:45:34Z

I don't really understand what this change is doing. If it's trying to output the diff as a proper unified diff, we would need to map the JSON diffs back to line/character numbers in the original file, and there is no trivial way to do that. So I assume this change is aiming for some compromise or middle ground, but I'm not sure exactly what. Please add more details to the PR description and the command's help text.

infokiller · 2023-11-10T07:49:49Z

> I don't really understand what this change is doing. If it's trying to output the diff as a proper unified diff, we would need to map the JSON diffs back to line/character numbers in the original file, and there is no trivial way to do that. So I assume this change is aiming for some compromise or middle ground, but I'm not sure exactly what. Please add more details to the PR description and the command's help text.

@vidartf sorry for the lack of details. I will try to explain the problem, and if you are OK with the solution, I'll also update the CLI help. The problem is described in dandavison/delta#1256, an issue I opened because https://github.com/dandavison/delta (syntax highlighting for diffs) didn't work with nbdiff. You can look at the issue for the full details (the original creator of unified diff also commented there). The TL;DR is:

The output of nbdiff is not compliant with unified diff.
Specifically, the problem with delta is that the hunk header (the line starting with @@) was missing.
I think correct line/char numbers are not strictly mandatory for the output to be considered "compliant", and for a notebook they may not be easy or even possible to produce.
The unified diff author suggested a different format that avoids ## lines so that the output is recognized by unified diff parsers; see the comment on dandavison/delta#1256.
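
To make the hunk-header point concrete, here is a minimal sketch of the kind of check a unified diff consumer might perform. The regular expression and the helper name `has_hunk_header` are illustrative assumptions, not code taken from delta or nbdime, and the two sample strings are hand-written rather than real nbdiff output.

```python
import re

# Usual hunk header shape: "@@ -start[,count] +start[,count] @@ optional text"
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def has_hunk_header(diff_text: str) -> bool:
    """Return True if any line of diff_text looks like a unified diff hunk header."""
    return any(HUNK_HEADER.match(line) for line in diff_text.splitlines())


# Hand-written samples (assumptions for illustration only).
with_header = "--- a.ipynb\n+++ b.ipynb\n@@ -1,2 +1,2 @@\n-old\n+new\n"
without_header = "--- a.ipynb\n+++ b.ipynb\n## modified /cells/0/source:\n-old\n+new\n"

print(has_hunk_header(with_header), has_hunk_header(without_header))
```

A parser that insists on at least one such header would accept the first sample and reject the second, which is roughly the failure mode described for delta.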

vidartf · 2023-11-10T08:14:04Z

Thanks for clarifying! Having an output that is more compatible with unified diff does indeed sound useful 👍 Retaining the current default, and putting the new behavior behind a flag sounds good. Since your main motivation here is to have it be parsed by other tools, I think it can be hard to change this output format after initial release. With that in mind, I would suggest the following:

Change the name of the flag to represent the target, i.e. something with "unified diff" in it. Maybe a --diff-format=unified flag, in case there are other formats to add in the future?
Add some unit tests. We have a decent suite of notebooks producing a large variety of different diffs, so ensuring they can all be parsed as unified diff seems like a good first step, and then we will also want to test the actual contents of a good few of these. We should probably add baseline testing support for all of our diff tests now that I think of it... (I.e. record the current diff outputs of diffs in the repo to pick up changes from future commits). We can add the baselining, but would appreciate help in testing for unified diff readability and sanity.
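
As a rough illustration of the flag naming suggested above, the option could be declared with argparse along these lines. This is only a sketch: the `build_parser` function, the `nbdiff-sketch` program name, and the `default` choice are hypothetical and do not reflect nbdime's actual parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the nbdiff parser; only the new option is shown.
    parser = argparse.ArgumentParser(prog="nbdiff-sketch")
    parser.add_argument(
        "--diff-format",
        choices=["default", "unified"],
        default="default",
        help="Output format; 'unified' aims for output that unified diff parsers accept.",
    )
    return parser


args = build_parser().parse_args(["--diff-format=unified"])
print(args.diff_format)
```

Using a `choices` list keeps the current output as the default and leaves room for further formats without adding another boolean flag each time.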

infokiller · 2023-11-10T09:09:05Z

Thanks @vidartf for the quick and helpful response. Your suggestions sound good. As for tests, I assume they should go into https://github.com/jupyter/nbdime/tree/master/nbdime/tests. Is that right?
Are there any specific tests that are good references for what you'd like to see?
Also, should these take the higher-level ops as input, or raw notebooks?

vidartf · 2023-11-10T19:32:59Z

Yes, I would probably make a new file, and then you could start with something like this:

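# Import paths assumed from nbdime's module layout; adjust if they differ.
from nbdime.args import prettyprint_config_from_args
from nbdime.diffing.notebooks import diff_notebooks
from nbdime.nbdiffapp import _build_arg_parser
from nbdime.prettyprint import pretty_print_notebook_diff


def test_notebook_diff(any_nb_pair):
    "Test unified diff output on any pair of notebooks in the test suite."
    a, b = any_nb_pair
    diff = diff_notebooks(a, b)

    output = []

    class Printer:
        # Minimal file-like object that collects everything written to it.
        def write(self, text):
            output.append(text)

    argv = []  # your arguments for diff CLI here
    args = _build_arg_parser().parse_args(argv)
    config = prettyprint_config_from_args(args, out=Printer())
    pretty_print_notebook_diff(a.name, b.name, a, diff, config)

    # expected_output is a placeholder for the recorded baseline of this pair.
    assert "".join(output) == expected_output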
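
The `expected_output` comparison above could be backed by recorded baseline files, in the spirit of the baselining idea mentioned earlier in the thread. A minimal sketch, assuming a hypothetical `check_baseline` helper and an arbitrary file layout; nothing here is existing nbdime test infrastructure:

```python
from pathlib import Path


def check_baseline(output: str, baseline: Path, update: bool = False) -> bool:
    """Compare output to a recorded baseline file.

    If the baseline is missing, or update=True, the output is written as the
    new baseline and the check passes.
    """
    if update or not baseline.exists():
        baseline.parent.mkdir(parents=True, exist_ok=True)
        baseline.write_text(output, encoding="utf-8")
        return True
    return baseline.read_text(encoding="utf-8") == output
```

A test would then assert `check_baseline("".join(output), some_path)`, where `some_path` is derived from the notebook pair; how to name those files is left open here.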

add --compatible diff flag to output a diff more compatible with othe…

b48a522

…r tools For context: dandavison/delta#1256

