Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Vector64.Create for constants #101662

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Apr 28, 2024

Closes #101661

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 28, 2024
@EgorBo
Copy link
Member Author

EgorBo commented Apr 29, 2024

PTAL @kunalspathak @tannergooding cc @dotnet/jit-contrib

Empy size diffs, 283 contexts with text diffs (becuase size is the same)

@@ -2406,14 +2406,36 @@ void CodeGen::genSetRegToConst(regNumber targetReg, var_types targetType, GenTre
}
else
{
// Get a temp integer register to compute long address.
regNumber addrReg = tree->GetSingleTempReg();
simd8_t val = vecCon->gtSimd8Val;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same handling should presumably be added for TYP_SIMD16 as well, yes?

It should also be fine to expose for TYP_SIMD12, since any operations are already required to mask out the unused bit when its impactful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder if this is missing some cases or potentially conflicting with the IsValidConstForMovImm in LowerHWIntrinsicCreate handling

That check is supposed to recognize and catch this case, lowering it instead to Arm64.DuplicateToVectorXXX, which is itself supposed to emit the more optimized mov instruction

It might be a case where we need both since sometimes we'll have Create(...) and sometimes we'll have CNS_VEC isntead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good point, I'll check what we generate for other VectorX.Create

Copy link
Member

@tannergooding tannergooding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes as is look correct, there's just likely more we could do here to ensure other cases are handled

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ARM64: suboptimal codegen for Vector64.Create(1)
2 participants