Extracted from #1209 and probably the most controversial piece.
The goal
I would like to come up with a nomenclature that can help Infection users and contributors get started or grasp some of the concepts that are reused in our codebase.
The ultimate goal is to push for a non-Infection-specific nomenclature for mutation testing that is agreed on by most people (and not just us). This will however take time, energy and discussions, and, more importantly, I have no interest in conducting or contributing to that effort if the Infection team does not agree to adhere to the terms laid out in this nomenclature.
I cherry-picked some elements and tried to come up with a comprehensible sentence for each of them. Note however that the proposed nomenclature does not match our current codebase terminology. Updating the codebase accordingly is a different task that can be done after we have agreed on the nomenclature.
This PR is mostly about agreeing on the terms. Changes to the form, adding sources & co. to the nomenclature are welcome, but I would like to keep those in a separate PR to reduce the noise.
The controversial changes
Mutagenesis
From @maks-rafalko:
Source: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46584.pdf
Mutant vs Mutation
As mentioned in #1209, I have found the usage of Mutant & Mutation in Infection confusing since day one. It does not mean that the usage is wrong, but to me it indicates that either the difference between our two classes is not strong enough (i.e. a design issue) or we are misusing the terminology.
I now tend to lean towards the second one. To sum up my opinion:
*: as you can see, given that technical distinction, when you are interested in the change in general and do not care whether it has been applied or not, it often does not matter which of the two terms you use.
@maks-rafalko rebuttal (just to be clear, this is not a declaration of war against @maks-rafalko; he was however the only one voicing his opinion in #1209, so I only have him to quote 😅)
My answer:
IMO when we talk of equivalent, higher-order or second-order mutants here, it's a common mix-up (and a forgivable one unless you want to be pedantic). But if you replaced "mutant" with "mutation" up there, I don't think you would confuse anyone.
From it:
[...] Each mutated version is called a mutant and tests detect and reject mutants by causing the behavior of the original version to differ from the mutant. This is called killing the mutant. [...]
IMO "mutated version" = "New program that differs from the original by applying a mutation." fits perfectly here. To be clearer: it's not saying "mutated version that is not applied yet that we may or may not apply to change the program and run the tests against".
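To make the distinction concrete, here is a minimal sketch of how I read the two terms. It is written in Python purely for illustration and does not mirror Infection's actual classes: the names `Mutation`, `Mutant`, `apply_mutation` and the `is_adult` example are all hypothetical. The mutation is only a description of a change; the mutant is the new program obtained once that change has been applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    """A description of a change; nothing has been applied yet (hypothetical type)."""
    line: int          # 1-based line number in the original source
    original: str      # token expected on that line
    replacement: str   # token to substitute for it


@dataclass(frozen=True)
class Mutant:
    """A new program that differs from the original by one applied mutation."""
    source: str
    mutation: Mutation


def apply_mutation(source: str, mutation: Mutation) -> Mutant:
    """Build the mutant; the original source string is left untouched."""
    lines = source.split("\n")
    index = mutation.line - 1
    if mutation.original not in lines[index]:
        raise ValueError("mutation does not match the original source")
    # Replace only the first occurrence on the targeted line.
    lines[index] = lines[index].replace(mutation.original, mutation.replacement, 1)
    return Mutant(source="\n".join(lines), mutation=mutation)


# Illustrative original program (an assumption made for this example).
ORIGINAL = "def is_adult(age):\n    return age >= 18\n"

mutation = Mutation(line=2, original=">=", replacement=">")
mutant = apply_mutation(ORIGINAL, mutation)
```

With this reading, a test killing the mutant is a test whose outcome differs between `ORIGINAL` and `mutant.source`, for instance one calling `is_adult(18)`.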
To extract the interesting paragraph:
[...] Such faults are deliberately seeded into the original program, by a simple syntactic change, to create a set of faulty programs called mutants, each containing a different syntactic change. [...]
To me, a "faulty program" fits perfectly the definition we gave to Mutant here.
To copy the text linked there:
We'll call a single small change to the source code a mutation. [...] The result of a mutation is called a mutant. In the example above, the code to the left part is part of the original source code, and the code to the right part of the mutant
IMO in this context it is more ambiguous, but again, if you consider that the result is the faulty program, it fits perfectly. From "[...] the code to the right part of the mutant" alone, I definitely agree this hints more at mutant = mutation, but then it would be interesting to see what the author considers the mutation to be in this case. So in all honesty, I think "mutation" would be more correct than "mutant" in that sentence. Whether this was intended or a mistake I cannot say on the author's behalf, and I would not hold such a mistake against them.
From Pitest as @maks-rafalko mentioned:
By applying the mutation operators PIT will generate a number (potentially a very large number) of mutants. These are Java classes which contain a mutation (or fault) which should make them behave differently from the unmutated class.
I admit I'm not 100% clear on how Pitest & Java operate... If "mutants. These are Java classes which contain a mutation" actually translates to forking the code and applying the changes there (remember that Java is compiled and IIRC Pitest looks at the bytecode), then it actually still fits; otherwise no.
I did not ping all the maintainers, but from my poll the following projects were in agreement (and no one raised their voice to disagree):
I would also add that mutation testing is very niche, so it is within our [mutation testing implementors] power to change and define new terms regardless of what academia and Wikipedia say. If an academic or researcher comes up with a term that no implementor ever uses or refers to, it is only going to be a dead term. This is not to say we should reject everything that comes from there, but rather that we do not have to take it as the ultimate source of truth either.