cranelift/x64: Optimize i128 comparisons #8427
Merged
Inequality comparisons between i128 values were previously eight instructions and this reduces them to two, plus one move if one of the inputs is still live afterward.
Equality comparisons were six instructions and are now three, plus up to two moves if both inputs are still live afterward.
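The description above does not spell out the new instruction sequences, so the following is only a sketch of one common x64 idiom that matches the stated counts. It assumes a `cmp` on the low halves followed by an `sbb` on the high halves for inequalities, and `xor`/`xor`/`or` for equality. The Rust below models the flag behavior in software. The function names (`split`, `ult_via_borrow`, `slt_via_borrow`, `eq_via_xor`) are hypothetical and are not taken from Cranelift.

```rust
/// Split a 128-bit value into its (low, high) 64-bit halves.
fn split(x: u128) -> (u64, u64) {
    (x as u64, (x >> 64) as u64)
}

/// Unsigned `a < b`, modeled as `cmp a_lo, b_lo; sbb a_hi, b_hi`.
/// The result is the carry (borrow) flag left by the high-half subtraction.
fn ult_via_borrow(a: u128, b: u128) -> bool {
    let (a_lo, a_hi) = split(a);
    let (b_lo, b_hi) = split(b);
    // cmp: CF is set when the low halves borrow.
    let (_, borrow_lo) = a_lo.overflowing_sub(b_lo);
    // sbb: a_hi - b_hi - CF, tracking the borrow out of either step.
    let (t, b1) = a_hi.overflowing_sub(b_hi);
    let (_, b2) = t.overflowing_sub(borrow_lo as u64);
    b1 || b2
}

/// Signed `a < b`, modeled with the same two instructions.
/// The result is the "less" condition, SF != OF, after the high-half sbb.
fn slt_via_borrow(a: i128, b: i128) -> bool {
    let (a_lo, a_hi) = split(a as u128);
    let (b_lo, b_hi) = split(b as u128);
    let borrow_lo = a_lo < b_lo;
    let (a_hi, b_hi) = (a_hi as i64, b_hi as i64);
    let (t, o1) = a_hi.overflowing_sub(b_hi);
    let (r, o2) = t.overflowing_sub(borrow_lo as i64);
    // The two partial overflows cancel when both occur, so OF is their XOR.
    let overflow = o1 != o2;
    (r < 0) != overflow
}

/// Equality, modeled as `xor lo; xor hi; or`: the combined value is zero
/// exactly when both halves match.
fn eq_via_xor(a: u128, b: u128) -> bool {
    let (a_lo, a_hi) = split(a);
    let (b_lo, b_hi) = split(b);
    ((a_lo ^ b_lo) | (a_hi ^ b_hi)) == 0
}
```

In this model the inequality needs no temporary beyond the destination of the `sbb`, which would explain the single extra move when an input is still live; the equality sequence overwrites both inputs, which would explain up to two moves.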
This removes 45 instructions from the test in x64/i128.clif that generates all possible i128 comparisons. In addition to using fewer instructions for each comparison, it also reduces register pressure enough that the function no longer spills.
Conditional branches on i128 values are a special case, but similar optimizations shrink them from six instructions to two.
This brings Cranelift in line with what rustc+LLVM generates for equivalent 128-bit comparisons.
This PR probably conflicts with #8421 in the filetest output; I'll rebase whichever one doesn't land first.