Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

analyze: new implementation of MIR-to-HIR rewrite lifting #934

Merged
merged 38 commits into from
Jun 12, 2023

Conversation

spernsteiner
Copy link
Contributor

This replaces rewrite::expr::hir_op with three new modules:

rewrite::expr::unlower: Builds the "unlowering map", which maps a MIR "precise location" (a Location plus a Vec<SubLoc> path) to information about the HIR that was lowered to produce that piece of the MIR. Specifically, it records the HirId of the HIR Expr that was lowered along with a description of how the MIR relates to the HIR. For example:

For example:

fn f(a: i32) -> i32 {
    a + 1
}

fn g(x: i32, y: i32) -> i32 {
    x + f(y)
}

For f, the unlowering map annotates the MIR as follows:

block bb0:
  bb0[0]: StorageLive(_2)
  bb0[1]: _2 = _1
    []: StoreIntoLocal, `a`
    [Rvalue]: Expr, `a`
  bb0[2]: _0 = Add(move _2, const 1_i32)
    []: StoreIntoLocal, `a + 1`
    [Rvalue]: Expr, `a + 1`
  bb0[3]: StorageDead(_2)
  bb0[4]: Terminator { source_info: ..., kind: return }

The statement _2 = _1 is associated with the expression a; the statement as a whole is storing the result of evaluating a into a MIR local, and the statement's rvalue _1 represents the expression a itself. Similarly, _0 = Add(move _2, const 1) stores the result of a + 1 into a local. If needed, we could extend the unlower pass to also record that move _2 (a.k.a. bb0[2] [Rvalue, RvalueOperand(0)]) is lowered from the Expr a.

On g, the unlowering map includes the following (among other entries):

bb0[5]: Terminator { source_info: ..., kind: _4 = f(move _5) -> [return: bb1, unwind: bb2] }
  []: StoreIntoLocal, `f(y)`
  [Rvalue]: Expr, `f(y)`
  [Rvalue, CallArg(0)]: Expr, `y`
bb1[1]: _0 = Add(move _3, move _4)
  []: StoreIntoLocal, `x + f(y)`
  [Rvalue]: Expr, `x + f(y)`

The call terminator _4 = f(move _5) computes f(y) and stores the result into a local; its rvalue is f(y) itself, and the first argument of the rvalue is y.

rewrite::expr::distribute: Given the MIR rewrites produced by mir_op and the unlowering map from unlower, this module distributes MIR rewrites to HIR nodes (identified by their HirIds). This step also checks for ambiguous case, such as multiple rewrites from different MIR locations being mapped to the same HIR node, where it's unclear in which order to apply the rewrites, and also detects MIR rewrites that couldn't be mapped to any HIR node.

Using the example from above:

bb0[5]: Terminator { source_info: ..., kind: _4 = f(move _5) -> [return: bb1, unwind: bb2] }
  []: StoreIntoLocal, `f(y)`
  [Rvalue]: Expr, `f(y)`
  [Rvalue, CallArg(0)]: Expr, `y`

A MIR rewrite on bb0[5] [Rvalue, CallArg(0)] would be attached to the MIR
Expr y, and a rewrite on bb0[5] [Rvalue] would be attached to f(y).
A MIR rewrite on bb0[5] [] (i.e. on the call terminator itself) would
result in an error, since there is no good place in the HIR to attach such a
rewrite.

rewrite::expr::convert: Converts the MirRewrites for each HIR Expr into Span-based rewrite::Rewrites. These are returned as the final result of expr::gen_expr_rewrites and are later applied to the input source code using rewrite::apply.


Overall, this branch simplifies the handling of SubLocs; distribute treats Vec<SubLoc> paths as mostly opaque (it expects each MIR rewrite's sub_loc field to exactly match an entry in the unlowering map), and convert doesn't deal with SubLocs at all. More precise SubLoc handling, which will be needed for some tricky pointer-to-pointer cases, goes in the unlower module, whose sole purpose is establishing the mapping between SubLocs and HIR. And unlower makes the mapping between MIR and HIR explicit (unlike hir_op, where it was computed implicitly while building rewrites), so it can be inspected for easier debugging.

All test cases produce the same rewrites as before this branch, except for one place in lighttpd-minimal, where we now place a Cell::from_mut cast in a more appropriate location.

Copy link
Contributor

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could d875e31 be split into a separate PR? It seems mostly separate and simpler than the rest of the PR.

@spernsteiner spernsteiner force-pushed the analyze-unlower branch 3 times, most recently from dbb30b8 to b3c59ae Compare May 30, 2023 22:16
spernsteiner and others added 22 commits June 9, 2023 16:54
Co-authored-by: Khyber Sen <kkysen@gmail.com>
Co-authored-by: Khyber Sen <kkysen@gmail.com>
Co-authored-by: David Anekstein <da@immunant.com>
@kkysen
Copy link
Contributor

kkysen commented Jun 12, 2023

Did you test string_casts.rs on #934 alone, or on a merge of #934 and #839? I've rebased this branch (#934) onto master (which now includes #839) and I haven't seen any test failures.

I merged #934 (this one) into #839 and tested it. If rebase #934 on master, which now includes #839, works, then that's great and there's no issue at all.

@spernsteiner spernsteiner merged commit 62a077a into master Jun 12, 2023
9 checks passed
kkysen added a commit that referenced this pull request Jun 12, 2023
…tring literals in `string_casts.rs` and `lighttpd-minimal`.
kkysen added a commit that referenced this pull request Jun 12, 2023
…iled string literals in `string_casts.rs` and `lighttpd-minimal`.
@kkysen kkysen linked an issue Jun 12, 2023 that may be closed by this pull request
kkysen added a commit that referenced this pull request Jun 12, 2023
…955)

With #839 (adding string cast support) and #934 (fixing #909) now
merged, we can now enable transpiled string literals in
`string_casts.rs` and `lighttpd-minimal`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

(c2rust-analyze) Support rewrites with implicit &raws
3 participants