Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support mcdc analysis for pattern matching #124278

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

ZhuUx
Copy link
Contributor

@ZhuUx ZhuUx commented Apr 23, 2024

To finish the second task at #124144.

Changes

  • Implement branch coverage for pattern matching specific to mcdc. As suggested at #124217, mcdc would better implement its own branch coverage methods for pattern match.
    That means, coverage-options=mcdc may give different branch coverage results of pattern matching with coverage-options=branch temporarily.
  • Generate decision mappings for all refutable patterns in match.

Note. The results might be a bit counter intuitive due to reorder as this comment introduces.

@rustbot
Copy link
Collaborator

rustbot commented Apr 23, 2024

r? @wesleywiser

rustbot has assigned @wesleywiser.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 23, 2024
@ZhuUx
Copy link
Contributor Author

ZhuUx commented Apr 23, 2024

For now there are still some work need to do and some code is not well optimized to reduce work in merging #124255 .
r? @ghost
@rustbot label +A-code-coverage
@rustbot author

@rustbot
Copy link
Collaborator

rustbot commented Apr 23, 2024

Failed to set assignee to ghost: invalid assignee

Note: Only org members with at least the repository "read" role, users with write permissions, or people who have commented on the PR may be assigned.

@rustbot rustbot added A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 23, 2024
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@ZhuUx ZhuUx force-pushed the pattern-match branch 2 times, most recently from 39732bf to 0255b0b Compare April 24, 2024 02:39
@Lambdaris
Copy link

Lambdaris commented Apr 24, 2024

Determining decision of pattern matching is more tricky than expected. I should write it down here in case someone feels confused about it.
I'm not sure if rustc promises anything about the order of checked pattern (probably not), so the description here should be only taken as a specific way determined by the implementation of compiler when this pr is drafting.

We call patterns like ( A | B, C (X | Y) ) as candidates, each candidate has match pairs which indicates sub patterns and also has their "sub pairs". For example, candidate for ( A | B, C (X | Y) ) has match pairs for A | B (let's call it M1) and C (X | Y) (M2) separately. Also M1 has sub pairs for A and B and M2 has sub pair X | Y.

The first rule is that candidate is constructed where all match pairs representing | pattern (directly) are moved to last
This is done at simplify_match_pairs.
For the example compiler checks C first because A | B is or pattern while C is not directly (though it has an or sub pattern).

The second rule is the first pair will be removed and compiler will insert its sub pairs into the candidate's match pairs if it is "full matched"
This is done at sort_candidate.
So when compiler checks C first, it removes it and takes match pair representing X | Y as the candidate's last match pair. If the sub pattern in C were not |, it would be inserted at front of all or patterns as the code suggests.

Thus in all, the evaluate order of if let ( A | B, C (X | Y) ) = val is:

  1. Check whether val.1 is C,
  2. Check whether val.0 is A | B,
  3. Check whether val.1.0 is X | Y.

While the order of if let (A | B, C(X)) = val is:

  1. Check whether val.1 is C,
  2. Check whether val.1.0 is X,
  3. Check whether val.0 is A | B.

And the order of if let (A(Z), C) = val is:

  1. Check whether val.0 is A,
  2. Check whether val.0.0 is Z,
  3. Check whether val.1 is C.

A more mischievous example is if let (A | B, C (X | Y, Z), D) = val, its control flow would be like:

  1. Check whether val.1 is C,
  2. Check whether val.2 is D,
  3. Check whether val.1.1 is Z,
  4. Check whether val.0 is A | B,
  5. Check whether val.1.0 is X | Y.

Hopefully the decision structure were not taken as mischief by users.

@ZhuUx ZhuUx changed the title Support mcdc analysis for pattern match Support mcdc analysis for pattern matching Apr 24, 2024
@rust-log-analyzer

This comment has been minimized.

@ZhuUx
Copy link
Contributor Author

ZhuUx commented Apr 28, 2024

I have pushed an impractical commit to show design for constructing decisions and mcdc branches of pattern matching.
Mostly is in MCDCDecisionBuilder. I'd like to finish it after merging #124255 and #124399 because the way to inject block marker should be changed.
But still any comments about it are welcomed.

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Apr 29, 2024

☔ The latest upstream changes (presumably #124255) made this pull request unmergeable. Please resolve the merge conflicts.

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 30, 2024
Split mcdc code to a sub module of coverageinfo

A further work from rust-lang#124217 . I have made relatively large changes when working on rust-lang#124278 so that it would better split them from `coverageinfo.rs` to avoid potential troubling merge work with improved branch coverage by `@Zalathar` .

Besides `BlockMarkerGenerator` is added to avoid ownership problems (mostly needed for following change of rust-lang#124278 )

All code changes are done in [a37d737a](rust-lang@a3d737a) while the second commit just renames the file.

cc `@RenjiSann` `@Zalathar`
This will impact your current work.
@ZhuUx
Copy link
Contributor Author

ZhuUx commented May 10, 2024

Implementation for if-let has been drafted. Due to llvm does not support nested decisions yet tests for let-chains are not added. I should try to reduce coverage expressions and investigate if it works for matching guards later.

@bors
Copy link
Contributor

bors commented May 10, 2024

☔ The latest upstream changes (presumably #124972) made this pull request unmergeable. Please resolve the merge conflicts.

@ZhuUx ZhuUx force-pushed the pattern-match branch 2 times, most recently from 17b9e82 to baefce9 Compare May 15, 2024 02:08
@ZhuUx ZhuUx force-pushed the pattern-match branch 2 times, most recently from 05534ce to 386f870 Compare May 16, 2024 03:16
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request May 20, 2024
coverage: Memoize and simplify counter expressions

When creating coverage counter expressions as part of coverage instrumentation, we often end up creating obviously-redundant expressions like `c1 + (c0 - c1)`, which is equivalent to just `c0`.

To avoid doing so, this PR checks when we would create an expression matching one of 5 patterns, and uses the simplified form instead:
- `(a - b) + b` → `a`.
- `(a + b) - b` → `a`.
- `(a + b) - a` → `b`.
- `a + (b - a)` → `b`.
- `a - (a - b)` → `b`.

Of all the different ways to combine 3 operands and 2 operators, these are the patterns that allow simplification.

(Some of those patterns currently don't occur in practice, but are included anyway for completeness, to avoid having to add them later as branch coverage and MC/DC coverage support expands.)

---

This PR also adds memoization for newly-created (or newly-simplified) counter expressions, to avoid creating duplicates.

This currently makes no difference to the final mappings, but is expected to be useful for MC/DC coverage of match expressions, as proposed by rust-lang#124278 (comment).
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request May 20, 2024
Rollup merge of rust-lang#125106 - Zalathar:expressions, r=davidtwco

coverage: Memoize and simplify counter expressions

When creating coverage counter expressions as part of coverage instrumentation, we often end up creating obviously-redundant expressions like `c1 + (c0 - c1)`, which is equivalent to just `c0`.

To avoid doing so, this PR checks when we would create an expression matching one of 5 patterns, and uses the simplified form instead:
- `(a - b) + b` → `a`.
- `(a + b) - b` → `a`.
- `(a + b) - a` → `b`.
- `a + (b - a)` → `b`.
- `a - (a - b)` → `b`.

Of all the different ways to combine 3 operands and 2 operators, these are the patterns that allow simplification.

(Some of those patterns currently don't occur in practice, but are included anyway for completeness, to avoid having to add them later as branch coverage and MC/DC coverage support expands.)

---

This PR also adds memoization for newly-created (or newly-simplified) counter expressions, to avoid creating duplicates.

This currently makes no difference to the final mappings, but is expected to be useful for MC/DC coverage of match expressions, as proposed by rust-lang#124278 (comment).
@bors
Copy link
Contributor

bors commented May 20, 2024

☔ The latest upstream changes (presumably #125331) made this pull request unmergeable. Please resolve the merge conflicts.

@ZhuUx ZhuUx force-pushed the pattern-match branch 2 times, most recently from 659eede to 7e0374f Compare May 21, 2024 08:16
@ZhuUx
Copy link
Contributor Author

ZhuUx commented May 21, 2024

Thanks to @Nadrieril now I could find a way to treat each arm in match statements as a decision.

The basic idea is to update test vectors of every decision in every arms. For instance,

match val {
    Pat::A | Pat::B => { /* update test vectors of `Pat::A | Pat::B` and `Pat::C | Pat::D` here */}
    Pat::C | Pat::D=> { /* update test vectors of `Pat::A | Pat::B`  and `Pat::C | Pat::D` here */}
    _ => { /* update test vectors of `Pat::A | Pat::B`  and `Pat::C | Pat::D` here */}
}

By updating test vectors of all decisions simultaneously we can deal with tests affecting different decisions. Fortunately we have implemented decision depth to such cases.

normal_branch_spans: Vec<MCDCBranchSpan>,
mcdc_spans: Vec<(MCDCDecisionSpan, Vec<MCDCBranchSpan>)>,
decision_ctx_stack: Vec<DecisionCtx>,
decision_ends: FxIndexMap<Span, Rc<RefCell<Vec<BlockMarkerId>>>>,
Copy link
Contributor Author

@ZhuUx ZhuUx May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decision_ends is introduced because:

  1. Candidates with | may not have pre_binding_block when matching finishes.
  2. For every arm in match statements, arm_block is generated in lower_match_arms and immediately compiler digs into code in the arm, which might contain many other decisions.

Hence when a PatternDecisionCtx is finished, we do not know all blocks which are "ends" of these decisions. Instead we fulfill them in add_ends_to_decision later. Because decisions produced by same ctx share same ends, Rc<RefCell<_>> is used here.

flip1995 pushed a commit to flip1995/rust-clippy that referenced this pull request May 24, 2024
coverage: Memoize and simplify counter expressions

When creating coverage counter expressions as part of coverage instrumentation, we often end up creating obviously-redundant expressions like `c1 + (c0 - c1)`, which is equivalent to just `c0`.

To avoid doing so, this PR checks when we would create an expression matching one of 5 patterns, and uses the simplified form instead:
- `(a - b) + b` → `a`.
- `(a + b) - b` → `a`.
- `(a + b) - a` → `b`.
- `a + (b - a)` → `b`.
- `a - (a - b)` → `b`.

Of all the different ways to combine 3 operands and 2 operators, these are the patterns that allow simplification.

(Some of those patterns currently don't occur in practice, but are included anyway for completeness, to avoid having to add them later as branch coverage and MC/DC coverage support expands.)

---

This PR also adds memoization for newly-created (or newly-simplified) counter expressions, to avoid creating duplicates.

This currently makes no difference to the final mappings, but is expected to be useful for MC/DC coverage of match expressions, as proposed by rust-lang/rust#124278 (comment).
@ZhuUx
Copy link
Contributor Author

ZhuUx commented May 28, 2024

Changed a bit to fit with constant folding. The main goal is to find all false blocks of branches and let the false term be sum of all counters in false blocks.
Now it uses two variants of MCDCBranchMarker to represent markers for boolean expressions and pattern matching respectively. The difference on false blocks between boolean expressions and pattern matching is the true blocks and false blocks of boolean expressions possibly does not share same predecessor in nested decisions. While for pattern matching, false block might not exist when building mir (such false blocks are inserted when instrumenting coverage statements for bcb edges). Hence it's hard to find a unified way to get false blocks for boolean expressions and patterns.
See comparison for details.

@ZhuUx
Copy link
Contributor Author

ZhuUx commented May 30, 2024

Upon compiling my projects with "--coverage-options=mcdc", I have found there are still some work to do with pattern matching expanded by macros. I would like to investigate whether we should refactor this implementation for macros.
@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants