Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Mutant #1209

Open
wants to merge 27 commits into
base: master
Choose a base branch
from
Open

Conversation

theofidry
Copy link
Member

@theofidry theofidry commented Mar 24, 2020

This is quite a big PR, I am aware, but the changes are not that big :)

Context

I think from almost day one I was a bit confused at Mutation & Mutant, without really being able to put my finger on it. Looking at other mutation testing tools did not clear up that confusion. At the end of the day, my conclusion was that there is two possible interpretation/usage:

  • "mutation" and "mutant" are synonyms in the context of mutation testing or at least there is quite a bit of context where the terms are interchangeable.
  • the two terms designate different things: one [mutation] designate the change that can be done whilst the other [mutant] designates the mutated_program, i.e. where the mutation has been applied

A look at Infection

In infection, we have Mutation and I think the term and usage is clear. I'm however less convinced of Mutant for the following reasons:

  • The way Mutant is constructed, it is but Mutation + some calculated state
  • Due to the way Mutant is constructed, Mutation needs to keep a decent chunk of state only to be consumed to create a Mutant, nothing else
  • We currently have a 1:1 relationship between Mutation and Mutant without really anything in the middle that could justify a separation

Change

This PR proposes to merge Mutant into Mutation, the results are:

  • MutantFactory becomes MutationFactory and is now injected into FileMutationGenerator (which previous was just doing new Mutation(...))
  • Mutation sees some extra properties from Mutant but also gets a chunk of properties removed, since were used only to create the calculated fields
  • MutantCodeFactory which has been renamed to MutationCodeFactory no longer needs to take a whole Mutation (which is a big VO!) and just requires a few fields now

Concerns

A summary of the concerns/questions raised:

What about high order mutants?

For context, there is not much work that has been done on high order mutants and I am not aware of any mutation testing tool that has successfully implemented it.

So I would say: keeping an abstraction for a concept we are not sure of and don't have any idea what to do with is not a good idea. Plus if we need changes, I don't think my changes will really get in the way :)

Renaming issue: are the names appropriate?

I think the term "mutant", in our code, corresponds more to the started MutantProcess#process rather than our current VO Mutant. I did the choice to merge Mutant into Mutation, but if you think Mutation should now be Mutant, I'm happy to change that as well.

I do not think the documentation needs to be changed more than that, a lot of not all is a context where the terms are interchangeable.

So we no longer have mutants?

As mentioned above: we still have them, it's MutantProcess#process. It just means it's not an aggregate/VO in our codebase.

Misc

Last but least: does this PR helps? Honestly apart from MutantFactory which required some non-trivial changes, objectively, this PR was really simple and I believe the tests are also simpler as a result. So IMO it's a good sign as well.

src/Engine.php Outdated Show resolved Hide resolved
src/Mutation/Mutation.php Outdated Show resolved Hide resolved
@theofidry theofidry changed the title Refactor/mutant less Remove Mutant Mar 25, 2020
@sanmai
Copy link
Member

sanmai commented Mar 26, 2020

I'm not so sure MutationCalculatedState is a way better than a Mutant. It is somewhat like calling a doorknob a handle-you-pull-to-open-a-door.

@theofidry
Copy link
Member Author

It is somewhat like calling a doorknob a handle-you-pull-to-open-a-door.

Still better than calling it a pullable-door :P

I'm happy to take another name. Keep in mind that though that it is strictly internal to Mutation (and by extension MutationFactory) unlike Mutant was

Copy link
Member

@sanmai sanmai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lot of changes here. I wish I could review functional changes separately from name changes.

src/Mutation/MutationFactory.php Outdated Show resolved Hide resolved
@theofidry
Copy link
Member Author

That's a lot of changes here. I wish I could review functional changes separately from name changes.

I wanted that as well, but I also needed the final state to judge if was worth it or not. But my recommendation is to checkout locally Mutation, MutationFactory, MutantCodeFactory & MutationTestingRunner: those are the ones that changed. All the others are:

  • renaming
  • using Mutation instead of Mutant

and frankly are quite minimal and uninteresting from a review PoV. It was really useful for me though to judge the final result and get an early feedback about the design

@theofidry
Copy link
Member Author

Update as per @sanmai comments, this PR has changed.

You can see it with the following rationale: Mutant = Mutation + some calculated state, but since at no point we skip any Mutation, we have a 1:1 relationship between Mutant & Mutation. As a result it makes sense to just merge both and avoid any confusion.

I hope this makes the reasoning a bit easier, but the PR needs no more tweaking for being reviewed :)

@maks-rafalko
Copy link
Member

maks-rafalko commented Mar 29, 2020

I'm afraid I don't understand this renaming. Every resource about mutation testing have a definition of Mutant, which is now just removed.

A couple of well known ones:

  • Equivalent Mutant
  • High Order Mutant
  • Second Order Mutant
  • etc.

Also

So, merging these 2 different things makes the things even more confusing IMO (these are my thought not looking into the code yet).

Also, didn't see any mentions of Mutagenesis before in any of the "official" docs. Where does it come from?


Another concern is about HOM (Higher Order Mutations).

Higher order mutations are the combinations of two or more single mutations.

E.g. a second order mutation is the combination of two mutations, a third order mutation is the combination of three mutations.
[...]
Most cases, HOM just creates a new mutant [...]

So, HOM is a combination of several mutations (N) which results in 1 Mutant. N:1 relations, not 1:1 which conflicts with your statement:

... we have a 1:1 relationship between Mutant & Mutation


I still don't agree with the changes from #1207 that are presented here as well.

Didn't have enough time to learn the code (though spent 20 minutes already), but at this point I'm -1 on these renaming.

@theofidry
Copy link
Member Author

I'm afraid I don't understand this renaming. Every resource about mutation testing have a definition of Mutant, which is now just removed.

Because there it implies an applied changed or at best is a context where mutation & mutant are interchangeable.

Also, didn't see any mentions of Mutagenesis before in any of the "official" docs. Where does it come from?

https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46584.pdf

So, merging these 2 different things makes the things even more confusing IMO (these are my thought not looking into the code yet).

My answer to this statement is:

At the end of the day, my conclusion was: either "mutation" and "mutant" are synonyms in the context of mutation testing, or if we consider them being different, the difference lies in the fact that the mutant designates the mutated program, i.e. where the mutation has been applied, and mutation the transformation to apply (but not yet applied). However, in infection this is not the case, since Mutant is a kind of Mutation with some calculated state.

So, HOM is a combination of several mutations (N) which results in 1 Mutant. N:1 relations, not 1:1 which conflicts with your statement:

I'll check if done in mutant. If not done there, I highly doubt we'll ever have to worry about that.

But supposedly, that would still be perfectly doable by instead of having a class Mutation, an abstraction with PlainMutation and HighOrderMutation for example.

Didn't have enough time to learn the code (though spent 20 minutes already), but at this point I'm -1 on these renaming.

As mentioned, the main change really is Mutation & MutationFactory. The rest is smooth & boring adaptations

@theofidry
Copy link
Member Author

Maybe let's put the problem another way: I don't think the current distinction between Mutation & Mutant in infection has any sense (code-wise) and compares to the more general purpose definitions we are referring to. Hence the proposal to merge it. If you think Mutant would be a better name than Mutation, I'm happy to go for that as well

@theofidry
Copy link
Member Author

@sanmai @maks-rafalko I've updated the description as an attempt to better express myself and address some of the concerns mentioned.

If there is any specific piece (e.g. @maks-rafalko you mentioned the renaming cases from #1207) I'll be happy to address them (in the case of those renaming cases @maks-rafalko I'll revert it as soon as I update that PR).

I would however like a first review/consensus before spending more time updating this PR :)

@theofidry
Copy link
Member Author

Rebased & reverted the namespace changes: so it requires another PR to fix some naming inconsistencies but that also lighten (a bit) the amount of stuff to review and opens up a dedicated space to discuss some naming concerns instead of doing it here.

I'll appreciate another attentive review on it, it was also very time consuming for me to review, revert & rebase :)

@theofidry
Copy link
Member Author

ping @infection/core

@BackEndTea
Copy link
Member

👍 I think this makes the language of the project a bit simpler, as well as the code

Copy link
Member

@maks-rafalko maks-rafalko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I tried to dig into the code one more time, still have difficulties to understand the renaming.

I have a feeling that 2 things are mixed up here which really confuse me:

  • probably the first one is moving needed methods / logic from one place to another, making Mutation & Mutant more clear.
  • the second one is removing Mutant and renaming everything to Mutation which simply conflicts with definitions of Mutation Testing.

I hope I provided a good example in this comment: #1209 (comment)

src/Mutant/MutantExecutionResultFactory.php Outdated Show resolved Hide resolved
);

$mutationFilePath = sprintf(
'%s/mutation.%s.infection.php',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to say I'm lost here.

Even from the nomenclature added in this PR, Mutation is:

The result of applying a mutator to the AST of a subject and represents a change to be applied.

But the result source code after applying a Mutation is called Mutant.

New program that differs from the original by applying a mutation.

IMO this conflicts with definitions from Mutation Testing theory (not even in Infection in particular)


After more digging into this patch, here is what we do in this PR:

$mutatedCode = $codeFactory->createCode();

# later in `MutationTestingRunner.php` we dump it to FS

$filesystem->dumpFile('mutation.%s.infection.php', $mutatedCode)

How it was before (from my POV - much better, note the variables / file names)

$mutatedCode = $mutantCodeFactory->createCode();

# later in `MutationTestingRunner.php` we dump it to FS

$filesystem->dumpFile('mutant.%s.infection.php', $mutatedCode)

Let's also see the name of $mutationFilePath variable. How can Mutation have a file path? From nomenclature, Mutation "... represents a change to be applied", while Mutant is "New program that differs from the original by applying a mutation."

New program can have a file (where this program is located).
Change to be applied - not. (Mutation is "The result of applying a mutator to the AST of a subject").

We store AST in memory, and we dump the result of mutated AST (which is a Mutant) to a file.

Copy link
Member Author

@theofidry theofidry Apr 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really thinking hard on this, so let's see if that makes sense.

The way I see it, the changed AST, a string diff, or a dumped file, is not the mutant, but the mutation.

I interpret "New program that differs from the original by applying a mutation." more as "the process with the mutated code loaded", i.e. in our case, the "process in which we intercept the autoloader to load the mutated code instead of the original".

Maybe this is too much of a stretch? But honestly this makes more sense to me and also has another merit: it is not specific to PHP/Infection. Dumping the mutated code to the FS is an Infection specificity due to how PHP works. Calling that [the dumped file] a mutant does not sound correct to me

@@ -81,19 +81,19 @@ protected function setUp(): void
$this->fail('Input code cant be the same as mutated code');
}

$mutants = $this->mutate($inputCode, $settings);
$mutatedCodes = $this->mutate($inputCode, $settings);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, the result of mutation is Mutant. As for me, the previous variable name is perfectly ok

$this->fileSystemMock
->expects($this->exactly(2))
->method('dumpFile')
->withConsecutive(
['/path/to/mutant0', 'mutated code 0'],
['/path/to/mutant1', 'mutated code 1']
['/path/to/mutation0.php', 'mutated code 0'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here - dumped new mutated code is called Mutant, not Mutation

@@ -66,35 +74,103 @@ public function __construct(
$this->tmpDir = $tmpDir;
$this->differ = $differ;
$this->printer = $printer;
$this->mutantCodeFactory = $mutantCodeFactory;
$this->codeFactory = $mutantCodeFactory;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this renaming is needed. This is exactly mutant code factory

@maks-rafalko
Copy link
Member

'll check if done in mutant. If not done there, I highly doubt we'll ever have to worry about that.

Are you saying that Mutant (ruby lib) is the source of truth for us? I doubt. If there is something is (not) implemented, I have no idea why we need to (not) do the same.

@theofidry
Copy link
Member Author

Are you saying that Mutant (ruby lib) is the source of truth for us? I doubt. If there is something is (not) implemented, I have no idea why we need to (not) do the same.

It's not a source of truth. It is however a reliable source of a more mature version (9yo now?). I also like to take pitest as a reference.

So that mutant does or doesn't do something doesn't translate in a need for us to imitate. It can however provide some insight on a usage. But to more broadly: we have yet to have anything concrete on higher order mutants, so trying to introduce an abstraction for the sole purpose of it at this point would be IMO unwise

@theofidry
Copy link
Member Author

I have a feeling that 2 things are mixed up here which really confuse me:

@maks-rafalko I totally get that feeling :) It's unfortunately a case where I feel both the point of contortion and confusion is about the naming. Because objectively code-wise this PR removes an extra abstraction we have no use of (at the very least for now) and (mostly as a result of that removal) simplifies things

@maks-rafalko
Copy link
Member

maks-rafalko commented Apr 22, 2020

I interpret "New program that differs from the original by applying a mutation." more as "the process with the mutated code loaded", i.e. in our case, the "process in which we intercept the autoloader to load the mutated code instead of the original".

Calling that [the dumped file] a mutant does not sound correct to me

^ this is where we have a conflicted understanding of what the Mutant is.

For example, here is how Wikipedia treat it:

mutant_wiki

Another example from Mutation Testing book:

mutant_book


Here is how Pitest defines it:

By applying the mutation operators PIT will generate a number (potentially a very large number) of mutants. These are Java classes which contain a mutation (or fault) which should make them behave differently from the unmutated class.

@theofidry
Copy link
Member Author

theofidry commented Apr 23, 2020

Pitest

Mutant.java
/**
 * A fully generated mutant
 */
public final class Mutant {

  private final MutationDetails details;
  private final byte[]          bytes;

  public Mutant(final MutationDetails details, final byte[] bytes) {
    this.details = details;
    this.bytes = bytes;
  }

  /**
   * Returns a data relating to the mutant
   *
   * @return A MutationDetails object
   */
  public MutationDetails getDetails() {
    return this.details;
  }

  /**
   * Returns a byte array containing the mutant class
   *
   * @return A byte array
   */
  public byte[] getBytes() {
    return this.bytes;
  }

}

But no Mutation.java

mbj/mutant

mutation.rb
# Represent a mutated node with its subject
  class Mutation
    include AbstractType, Adamantium::Flat
    include Concord::Public.new(:subject, :node)

    CODE_DELIMITER = "\0"
    CODE_RANGE     = (0..4).freeze

    # Identification string
    #
    # @return [String]
    def identification
      "#{self.class::SYMBOL}:#{subject.identification}:#{code}"
    end
    memoize :identification

    # Mutation code
    #
    # @return [String]
    def code
      sha1[CODE_RANGE]
    end
    memoize :code

    # Normalized mutation source
    #
    # @return [String]
    def source
      Unparser.unparse(node)
    end
    memoize :source

    # The monkeypatch to insert the mutation
    #
    # @return [String]
    def monkeypatch
      Unparser.unparse(subject.context.root(node))
    end
    memoize :monkeypatch

    # Normalized original source
    #
    # @return [String]
    def original_source
      subject.source
    end

    # Test if mutation is killed by test reports
    #
    # @param [Result::Test] test_result
    #
    # @return [Boolean]
    def self.success?(test_result)
      self::TEST_PASS_SUCCESS.equal?(test_result.passed)
    end

    # Insert mutated node
    #
    # @param kernel [Kernel]
    #
    # @return [Loader::Result]
    def insert(kernel)
      subject.prepare
      Loader.call(
        binding: TOPLEVEL_BINDING,
        kernel:  kernel,
        source:  monkeypatch,
        subject: subject
      )
    end

  private

    # SHA1 sum of source and subject identification
    #
    # @return [String]
    def sha1
      Digest::SHA1.hexdigest(subject.identification + CODE_DELIMITER + source)
    end
    memoize :sha1

    # Evil mutation that should case mutations to fail tests
    class Evil < self

      SYMBOL            = 'evil'
      TEST_PASS_SUCCESS = false

    end # Evil

    # Neutral mutation that should not cause mutations to fail tests
    class Neutral < self

      SYMBOL            = 'neutral'
      TEST_PASS_SUCCESS = true

    end # Neutral

    # Noop mutation, special case of neutral
    class Noop < Neutral

      SYMBOL            = 'noop'
      TEST_PASS_SUCCESS = true

    end # Noop

  end # Mutation

But no Mutant

Stryker

Mutant.ts
interface Mutant {
  mutatorName: string;
  fileName: string;
  range: Range;
  replacement: string;
}

No Mutation interface.

You also mention two sources from mutation testing book, mine comes from another. So I don't think there is a consensus here and admittedly, if you go to someone and say we have:

$x = $a + $b;

and there is a "X":

$x = $a - $b

Mull

MutationPoint.h
class MutationPoint {
  Mutator *mutator;
  const MutationPointAddress address;
  Bitcode *bitcode;
  llvm::Function *originalFunction;
  llvm::Function *mutatedFunction;
  std::string uniqueIdentifier;
  std::string diagnostics;
  std::string replacement;
  const SourceLocation sourceLocation;
  std::vector<std::pair<Test *, int>> reachableTests;
  irm::IRMutation *irMutator;

public:
  MutationPoint(Mutator *mutator, irm::IRMutation *irMutator, llvm::Instruction *instruction,
                std::string replacement, Bitcode *m, std::string diagnostics);

  ~MutationPoint() = default;

  Mutator *getMutator();
  Mutator *getMutator() const;

  const MutationPointAddress &getAddress() const;
  llvm::Value *getOriginalValue() const;
  Bitcode *getBitcode() const;

  llvm::Function *getOriginalFunction();
  void setMutatedFunction(llvm::Function *function);

  const SourceLocation &getSourceLocation() const;

  void addReachableTest(Test *test, int distance);
  void applyMutation();

  const std::vector<std::pair<Test *, int>> &getReachableTests() const;

  std::string getUniqueIdentifier();
  std::string getUniqueIdentifier() const;

  std::string getMutatorIdentifier() const;

  const std::string &getDiagnostics();
  const std::string &getDiagnostics() const;

  const std::string &getReplacement();

  std::string getTrampolineName();
  std::string getMutatedFunctionName();
  std::string getOriginalFunctionName() const;

  std::string dump() const;
  std::string dumpSourceCodeContext() const;
};

And no "Mutant.h"

Instead of "X", what is the more fitted? "mutation" or "mutant"? Do you think many people have a strong opinion about it and more importantly does it impact the understanding the the above.

Out of curiosity I submitted that part of the nomenclature to the mutation testing slack to see some other opinions. Will report back...

@maks-rafalko just so we move forward a bit: are we disagreeing only on the mutant/mutation definition or are we also disagreeing in merging the two classes into one? (If that's the case could you expand on to why?)

@theofidry
Copy link
Member Author

Reporting on the poll: so far the following projects agree on that piece of the nomenclature:

  • mull
  • mutant
  • mutmut

This was referenced Nov 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants