New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove Mutant #1209
base: master
Are you sure you want to change the base?
Remove Mutant #1209
Conversation
…ts() to hasTests()
I'm not so sure |
Still better than calling it a pullable-door :P I'm happy to take another name. Keep in mind that though that it is strictly internal to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a lot of changes here. I wish I could review functional changes separately from name changes.
I wanted that as well, but I also needed the final state to judge if was worth it or not. But my recommendation is to checkout locally
and frankly are quite minimal and uninteresting from a review PoV. It was really useful for me though to judge the final result and get an early feedback about the design |
Update as per @sanmai comments, this PR has changed. You can see it with the following rationale: I hope this makes the reasoning a bit easier, but the PR needs no more tweaking for being reviewed :) |
I'm afraid I don't understand this renaming. Every resource about mutation testing have a definition of Mutant, which is now just removed. A couple of well known ones:
Also
So, merging these 2 different things makes the things even more confusing IMO (these are my thought not looking into the code yet). Also, didn't see any mentions of Another concern is about HOM (Higher Order Mutations).
So, HOM is a combination of several mutations (N) which results in 1 Mutant. N:1 relations, not 1:1 which conflicts with your statement:
I still don't agree with the changes from #1207 that are presented here as well. Didn't have enough time to learn the code (though spent 20 minutes already), but at this point I'm -1 on these renaming. |
Because there it implies an applied changed or at best is a context where mutation & mutant are interchangeable.
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46584.pdf
My answer to this statement is:
I'll check if done in mutant. If not done there, I highly doubt we'll ever have to worry about that. But supposedly, that would still be perfectly doable by instead of having a class
As mentioned, the main change really is |
Maybe let's put the problem another way: I don't think the current distinction between |
@sanmai @maks-rafalko I've updated the description as an attempt to better express myself and address some of the concerns mentioned. If there is any specific piece (e.g. @maks-rafalko you mentioned the renaming cases from #1207) I'll be happy to address them (in the case of those renaming cases @maks-rafalko I'll revert it as soon as I update that PR). I would however like a first review/consensus before spending more time updating this PR :) |
Rebased & reverted the namespace changes: so it requires another PR to fix some naming inconsistencies but that also lighten (a bit) the amount of stuff to review and opens up a dedicated space to discuss some naming concerns instead of doing it here. I'll appreciate another attentive review on it, it was also very time consuming for me to review, revert & rebase :) |
ping @infection/core |
👍 I think this makes the language of the project a bit simpler, as well as the code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I tried to dig into the code one more time, still have difficulties to understand the renaming.
I have a feeling that 2 things are mixed up here which really confuse me:
- probably the first one is moving needed methods / logic from one place to another, making Mutation & Mutant more clear.
- the second one is removing Mutant and renaming everything to Mutation which simply conflicts with definitions of Mutation Testing.
I hope I provided a good example in this comment: #1209 (comment)
); | ||
|
||
$mutationFilePath = sprintf( | ||
'%s/mutation.%s.infection.php', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to say I'm lost here.
Even from the nomenclature added in this PR, Mutation is:
The result of applying a mutator to the AST of a subject and represents a change to be applied.
But the result source code after applying a Mutation is called Mutant.
New program that differs from the original by applying a mutation.
IMO this conflicts with definitions from Mutation Testing theory (not even in Infection in particular)
After more digging into this patch, here is what we do in this PR:
$mutatedCode = $codeFactory->createCode();
# later in `MutationTestingRunner.php` we dump it to FS
$filesystem->dumpFile('mutation.%s.infection.php', $mutatedCode)
How it was before (from my POV - much better, note the variables / file names)
$mutatedCode = $mutantCodeFactory->createCode();
# later in `MutationTestingRunner.php` we dump it to FS
$filesystem->dumpFile('mutant.%s.infection.php', $mutatedCode)
Let's also see the name of $mutationFilePath
variable. How can Mutation have a file path? From nomenclature, Mutation "... represents a change to be applied", while Mutant is "New program that differs from the original by applying a mutation."
New program can have a file (where this program is located).
Change to be applied - not. (Mutation is "The result of applying a mutator to the AST of a subject").
We store AST in memory, and we dump the result of mutated AST (which is a Mutant) to a file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really thinking hard on this, so let's see if that makes sense.
The way I see it, the changed AST, a string diff, or a dumped file, is not the mutant, but the mutation.
I interpret "New program that differs from the original by applying a mutation." more as "the process with the mutated code loaded", i.e. in our case, the "process in which we intercept the autoloader to load the mutated code instead of the original".
Maybe this is too much of a stretch? But honestly this makes more sense to me and also has another merit: it is not specific to PHP/Infection. Dumping the mutated code to the FS is an Infection specificity due to how PHP works. Calling that [the dumped file] a mutant does not sound correct to me
@@ -81,19 +81,19 @@ protected function setUp(): void | |||
$this->fail('Input code cant be the same as mutated code'); | |||
} | |||
|
|||
$mutants = $this->mutate($inputCode, $settings); | |||
$mutatedCodes = $this->mutate($inputCode, $settings); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, the result of mutation is Mutant. As for me, the previous variable name is perfectly ok
$this->fileSystemMock | ||
->expects($this->exactly(2)) | ||
->method('dumpFile') | ||
->withConsecutive( | ||
['/path/to/mutant0', 'mutated code 0'], | ||
['/path/to/mutant1', 'mutated code 1'] | ||
['/path/to/mutation0.php', 'mutated code 0'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here - dumped new mutated code is called Mutant, not Mutation
@@ -66,35 +74,103 @@ public function __construct( | |||
$this->tmpDir = $tmpDir; | |||
$this->differ = $differ; | |||
$this->printer = $printer; | |||
$this->mutantCodeFactory = $mutantCodeFactory; | |||
$this->codeFactory = $mutantCodeFactory; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this renaming is needed. This is exactly mutant code factory
Are you saying that Mutant (ruby lib) is the source of truth for us? I doubt. If there is something is (not) implemented, I have no idea why we need to (not) do the same. |
It's not a source of truth. It is however a reliable source of a more mature version (9yo now?). I also like to take pitest as a reference. So that mutant does or doesn't do something doesn't translate in a need for us to imitate. It can however provide some insight on a usage. But to more broadly: we have yet to have anything concrete on higher order mutants, so trying to introduce an abstraction for the sole purpose of it at this point would be IMO unwise |
@maks-rafalko I totally get that feeling :) It's unfortunately a case where I feel both the point of contortion and confusion is about the naming. Because objectively code-wise this PR removes an extra abstraction we have no use of (at the very least for now) and (mostly as a result of that removal) simplifies things |
^ this is where we have a conflicted understanding of what the Mutant is. For example, here is how Wikipedia treat it: Another example from Mutation Testing book: Here is how Pitest defines it:
|
PitestMutant.java/**
* A fully generated mutant
*/
public final class Mutant {
private final MutationDetails details;
private final byte[] bytes;
public Mutant(final MutationDetails details, final byte[] bytes) {
this.details = details;
this.bytes = bytes;
}
/**
* Returns a data relating to the mutant
*
* @return A MutationDetails object
*/
public MutationDetails getDetails() {
return this.details;
}
/**
* Returns a byte array containing the mutant class
*
* @return A byte array
*/
public byte[] getBytes() {
return this.bytes;
}
} But no mbj/mutantmutation.rb# Represent a mutated node with its subject
class Mutation
include AbstractType, Adamantium::Flat
include Concord::Public.new(:subject, :node)
CODE_DELIMITER = "\0"
CODE_RANGE = (0..4).freeze
# Identification string
#
# @return [String]
def identification
"#{self.class::SYMBOL}:#{subject.identification}:#{code}"
end
memoize :identification
# Mutation code
#
# @return [String]
def code
sha1[CODE_RANGE]
end
memoize :code
# Normalized mutation source
#
# @return [String]
def source
Unparser.unparse(node)
end
memoize :source
# The monkeypatch to insert the mutation
#
# @return [String]
def monkeypatch
Unparser.unparse(subject.context.root(node))
end
memoize :monkeypatch
# Normalized original source
#
# @return [String]
def original_source
subject.source
end
# Test if mutation is killed by test reports
#
# @param [Result::Test] test_result
#
# @return [Boolean]
def self.success?(test_result)
self::TEST_PASS_SUCCESS.equal?(test_result.passed)
end
# Insert mutated node
#
# @param kernel [Kernel]
#
# @return [Loader::Result]
def insert(kernel)
subject.prepare
Loader.call(
binding: TOPLEVEL_BINDING,
kernel: kernel,
source: monkeypatch,
subject: subject
)
end
private
# SHA1 sum of source and subject identification
#
# @return [String]
def sha1
Digest::SHA1.hexdigest(subject.identification + CODE_DELIMITER + source)
end
memoize :sha1
# Evil mutation that should case mutations to fail tests
class Evil < self
SYMBOL = 'evil'
TEST_PASS_SUCCESS = false
end # Evil
# Neutral mutation that should not cause mutations to fail tests
class Neutral < self
SYMBOL = 'neutral'
TEST_PASS_SUCCESS = true
end # Neutral
# Noop mutation, special case of neutral
class Noop < Neutral
SYMBOL = 'noop'
TEST_PASS_SUCCESS = true
end # Noop
end # Mutation But no StrykerMutant.tsinterface Mutant {
mutatorName: string;
fileName: string;
range: Range;
replacement: string;
} No You also mention two sources from mutation testing book, mine comes from another. So I don't think there is a consensus here and admittedly, if you go to someone and say we have: $x = $a + $b; and there is a "X": $x = $a - $b MullMutationPoint.hclass MutationPoint {
Mutator *mutator;
const MutationPointAddress address;
Bitcode *bitcode;
llvm::Function *originalFunction;
llvm::Function *mutatedFunction;
std::string uniqueIdentifier;
std::string diagnostics;
std::string replacement;
const SourceLocation sourceLocation;
std::vector<std::pair<Test *, int>> reachableTests;
irm::IRMutation *irMutator;
public:
MutationPoint(Mutator *mutator, irm::IRMutation *irMutator, llvm::Instruction *instruction,
std::string replacement, Bitcode *m, std::string diagnostics);
~MutationPoint() = default;
Mutator *getMutator();
Mutator *getMutator() const;
const MutationPointAddress &getAddress() const;
llvm::Value *getOriginalValue() const;
Bitcode *getBitcode() const;
llvm::Function *getOriginalFunction();
void setMutatedFunction(llvm::Function *function);
const SourceLocation &getSourceLocation() const;
void addReachableTest(Test *test, int distance);
void applyMutation();
const std::vector<std::pair<Test *, int>> &getReachableTests() const;
std::string getUniqueIdentifier();
std::string getUniqueIdentifier() const;
std::string getMutatorIdentifier() const;
const std::string &getDiagnostics();
const std::string &getDiagnostics() const;
const std::string &getReplacement();
std::string getTrampolineName();
std::string getMutatedFunctionName();
std::string getOriginalFunctionName() const;
std::string dump() const;
std::string dumpSourceCodeContext() const;
}; And no "Mutant.h" Instead of "X", what is the more fitted? "mutation" or "mutant"? Do you think many people have a strong opinion about it and more importantly does it impact the understanding the the above. Out of curiosity I submitted that part of the nomenclature to the mutation testing slack to see some other opinions. Will report back... @maks-rafalko just so we move forward a bit: are we disagreeing only on the mutant/mutation definition or are we also disagreeing in merging the two classes into one? (If that's the case could you expand on to why?) |
Reporting on the poll: so far the following projects agree on that piece of the nomenclature:
|
This is quite a big PR, I am aware, but the changes are not that big :)
Context
I think from almost day one I was a bit confused at
Mutation
&Mutant
, without really being able to put my finger on it. Looking at other mutation testing tools did not clear up that confusion. At the end of the day, my conclusion was that there is two possible interpretation/usage:A look at Infection
In infection, we have
Mutation
and I think the term and usage is clear. I'm however less convinced ofMutant
for the following reasons:Mutant
is constructed, it is butMutation
+ some calculated stateMutant
is constructed,Mutation
needs to keep a decent chunk of state only to be consumed to create aMutant
, nothing elseMutation
andMutant
without really anything in the middle that could justify a separationChange
This PR proposes to merge
Mutant
intoMutation
, the results are:MutantFactory
becomesMutationFactory
and is now injected intoFileMutationGenerator
(which previous was just doingnew Mutation(...)
)Mutation
sees some extra properties fromMutant
but also gets a chunk of properties removed, since were used only to create the calculated fieldsMutantCodeFactory
which has been renamed toMutationCodeFactory
no longer needs to take a wholeMutation
(which is a big VO!) and just requires a few fields nowConcerns
A summary of the concerns/questions raised:
What about high order mutants?
For context, there is not much work that has been done on high order mutants and I am not aware of any mutation testing tool that has successfully implemented it.
So I would say: keeping an abstraction for a concept we are not sure of and don't have any idea what to do with is not a good idea. Plus if we need changes, I don't think my changes will really get in the way :)
Renaming issue: are the names appropriate?
I think the term "mutant", in our code, corresponds more to the started
MutantProcess#process
rather than our current VOMutant
. I did the choice to mergeMutant
intoMutation
, but if you thinkMutation
should now beMutant
, I'm happy to change that as well.I do not think the documentation needs to be changed more than that, a lot of not all is a context where the terms are interchangeable.
So we no longer have mutants?
As mentioned above: we still have them, it's
MutantProcess#process
. It just means it's not an aggregate/VO in our codebase.Misc
Last but least: does this PR helps? Honestly apart from
MutantFactory
which required some non-trivial changes, objectively, this PR was really simple and I believe the tests are also simpler as a result. So IMO it's a good sign as well.