Introduce a nomenclature #1428

theofidry · 2020-11-12T17:03:25Z

Extracted from #1209 and probably the most controversial piece.

The goal

I would like to come up with a nomenclature that can guide Infection users and contributors as they get started or try to grasp concepts that are reused in our codebase.

The ultimate goal is to push for a nomenclature that is not Infection-specific and that is agreed on by most people working on mutation testing (and not just us). This will, however, take time, energy and discussions, and, more importantly, I have no interest in conducting or contributing to that effort if the Infection team does not agree to adhere to the terms laid out in this nomenclature.

I cherry-picked some elements and tried to come up with a comprehensible sentence for each of them. Note, however, that the proposed nomenclature does not match our current codebase terminology. Updating the codebase accordingly is a different task that can be done after we have agreed on the nomenclature.

This PR is mostly about agreeing on the terms. Changes to the form, adding sources & co. to the nomenclature are welcome, but I would like to keep those in a separate PR to reduce the noise.

The controversial changes

Mutagenesis

From @maks-rafalko:

Also, didn't see any mentions of Mutagenesis before in any of the "official" docs. Where does it come from?

Source: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46584.pdf

Mutant vs Mutation

As mentioned in #1209, I have found the usage of Mutant & Mutation in Infection confusing since day one. It does not mean that the usage is wrong, but to me it indicates that either the difference between our two classes is not strong enough (i.e. a design issue) or we are misusing the terminology.

I now tend to lean towards the second one. To sum up my opinion:

For the general public, "mutation" and "mutant" are two interchangeable terms, and even within more mutation testing specific talks they can be used interchangeably in most contexts*
After a bit more investigation and talks with authors of other mutation testing tools, there is a worthwhile technical distinction: mutation designates the change that can be made, whilst mutant designates the mutated program, i.e. the program to which the mutation has been applied

*: you can see that, based on that technical distinction, when you are interested in the change in general and do not care whether it has been applied or not, it often does not matter which of the two terms you employ.
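To make the distinction above concrete, here is a minimal sketch in Python. It is purely illustrative: the names `Mutation`, `Mutant` and `apply_mutation` are hypothetical and do not match Infection's actual classes (nor those of any other tool). A mutation only describes a change that could be made; a mutant is the program obtained once that change has been applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    """Hypothetical: a change that *can* be made, not yet applied."""
    line: int          # 1-based line number in the original source
    original: str      # text expected on that line
    replacement: str   # text to put in its place


@dataclass(frozen=True)
class Mutant:
    """Hypothetical: the mutated program, i.e. the source with a mutation applied."""
    mutation: Mutation
    source: str


def apply_mutation(mutation: Mutation, source: str) -> Mutant:
    """Build the mutant by applying the mutation to the original source."""
    lines = source.splitlines()
    target = lines[mutation.line - 1]
    if mutation.original not in target:
        raise ValueError("mutation does not match the source")
    # Replace only the first occurrence so that one mutation is one change.
    lines[mutation.line - 1] = target.replace(
        mutation.original, mutation.replacement, 1
    )
    return Mutant(mutation=mutation, source="\n".join(lines))


original = "def add(a, b):\n    return a + b"
plus_to_minus = Mutation(line=2, original="+", replacement="-")
mutant = apply_mutation(plus_to_minus, original)
```

In this sketch `plus_to_minus` exists independently of any program, whereas `mutant` only exists once the change has been applied to `original`. When you only care about the change itself, either object identifies it, which is why the two terms are so often used interchangeably.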

@maks-rafalko's rebuttal (just to be clear, this is not a declaration of war against @maks-rafalko; he was however the only one voicing his opinion in #1209, so I only have him to quote 😅):

I'm afraid I don't understand this renaming. Every resource about mutation testing have a definition of Mutant, which is now just removed.

A couple of well known ones:

Equivalent Mutant

High Order Mutant

Second Order Mutant

etc.

Also

Wikipedia's article has this definition

Mutation Testing Repository 's articles use this definition

https://leanpub.com/mutationtesting book distinguish Mutant and Mutation

So, merging these 2 different things makes the things even more confusing IMO (these are my thought not looking into the code yet).

My answer:

IMO when we talk of equivalent, higher order or second order mutants here, it is a common (and forgivable, unless you want to be pedantic) mix-up. If you replaced "mutant" with "mutation" up there, I don't think you would confuse anyone.

Wikipedia's article has this definition

From it:

[...] Each mutated version is called a mutant and tests detect and reject mutants by causing the behavior of the original version to differ from the mutant. This is called killing the mutant. [...]

IMO "mutated version" = "New program that differs from the original by applying a mutation." fits perfectly here. To be clearer: it is not saying "mutated version that is not applied yet, which we may or may not apply to change the program and run the tests against".

Mutation Testing Repository 's articles use this definition

To extract the interesting paragraph:

[...] Such faults are deliberately seeded into the original program, by a simple syntactic change, to create a set of faulty programs called mutants, each containing a different syntactic change. [...]

To me, a "faulty program" fits perfectly the definition we gave to Mutant here.

https://leanpub.com/mutationtesting book distinguish Mutant and Mutation

To copy the text linked there:

We'll call a single small change to the source code a mutation. [...] The result of a mutation is called a mutant. In the example above, the code to the left part is part of the original source code, and the code to the right part of the mutant

IMO in this context it is more ambiguous, but again, if you consider that the result is the faulty program, it fits perfectly. From "[...] the code to the right part of the mutant" alone, I definitely agree this hints more at mutant = mutation, but then it would be interesting to see what the author considers the mutation to be in this case. So, in all honesty, I think "mutation" would be more correct than "mutant" in that sentence. Whether this was intended or a mistake, I cannot speak for the author, and I would not hold such a mistake against them.

From Pitest as @maks-rafalko mentioned:

By applying the mutation operators PIT will generate a number (potentially a very large number) of mutants. These are Java classes which contain a mutation (or fault) which should make them behave differently from the unmutated class.

I admit I'm not 100% clear on how Pitest & Java operate... If "mutants. These are Java classes which contain a mutation" actually translates to forking the code and applying the changes there (remember that Java is compiled and IIRC Pitest looks at the bytecode), then it actually still fits; otherwise no.

I did not ping all the maintainers, but from my poll the maintainers of the following projects agreed (and no one raised their voice to disagree):

mull
mutant
mutmut

I would also add that mutation testing is very niche, so it is within our [mutation testing implementors] power to change and define new terms regardless of what academia and Wikipedia say. If an academic or researcher comes up with a term that no implementor ever uses or refers to, it is bound to be a dead term. This is not to say we should reject everything that comes from there, but rather that we do not have to take it as the ultimate source of truth either.

theofidry · 2020-11-12T17:07:08Z

@maks-rafalko just in case: I am a bit tilted at the moment and unfortunately you are the only one I can quote here about disagreements [because you are the only one who articulated points in #1209]. So although I tried to re-read myself so as not to say any BS, if there is anything rubbing you the wrong way here it's really not intended 🙇

theofidry · 2020-11-12T17:13:17Z

I posted an excerpt in the new Mutation Testing Discord; I will report any feedback here.

theofidry · 2020-12-16T08:32:36Z

Feedback from Discord is somewhat mixed regarding Mutation vs. Mutant:

Regardless of the outcome, what is clear is that the difference is subtle and in most cases does not matter, hence it should not impact the docs or literature in any way. I.e. the terms are interchangeable there
Stryker is the only codebase reporting to have both Mutation & Mutant. But they are actually in the same situation as us: their Mutant is Mutation + some extra data to represent the result of the execution. They are not entirely happy with it, but it doesn't appear to be a big discomfort for them either
Most, however, Stryker included, agree on the Mutation vs. Mutant distinction, although not all are willing to change their codebase because we put a nomenclature down (which is perfectly fine)
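For readers unfamiliar with the "Mutant is Mutation + some extra data" situation described above, here is a rough sketch of what it looks like. This is a hypothetical illustration in Python, not Stryker's or Infection's actual code: the class names, fields and statuses are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class Status(Enum):
    """Hypothetical execution outcomes."""
    KILLED = "killed"
    SURVIVED = "survived"


@dataclass(frozen=True)
class Mutation:
    """Hypothetical: describes the change to make."""
    file: str
    line: int
    replacement: str


@dataclass(frozen=True)
class Mutant:
    """Hypothetical: a mutation plus data about its execution.

    Note that no mutated program is stored here, which is why this
    "Mutant" is closer to an execution result than to the mutant of
    the proposed nomenclature.
    """
    mutation: Mutation
    status: Optional[Status] = None  # None until the tests have run


def with_result(mutant: Mutant, status: Status) -> Mutant:
    """Return a copy of the mutant carrying the test run outcome."""
    return replace(mutant, status=status)


mutation = Mutation(file="src/Calculator.php", line=12, replacement="-")
pending = Mutant(mutation=mutation)
executed = with_result(pending, Status.KILLED)
```

Under the distinction discussed in this PR, a class shaped like this `Mutant` would arguably deserve a different name, since it wraps the change and its outcome rather than the mutated program itself.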

Introduce a nomenclature

a677450

sanmai added the DX Developer Experience label Nov 13, 2020

maks-rafalko self-requested a review November 13, 2020 04:33

theofidry added 2 commits December 16, 2020 09:21

Update nomenclature.md

7dda9b7

Update nomenclature.md

c241657

theofidry requested review from BackEndTea and sanmai December 16, 2020 08:39
