Combine common code / reduce codegen by factoring out common parts #819

Merged
merged 4 commits into from May 16, 2020

Conversation

bluss
Member

@bluss bluss commented May 16, 2020

The goal is to reduce the amount of code (as counted by cargo-llvm-lines) generated for our functions.

  1. Make a few shape check functions less generic by dropping the dependency on the element type (and its size) as early as possible
  2. Factor out the inner loop in the three main Zip apply_core_* methods
  3. Small change to Baseiter::fold that also reduces the size of the loop

bluss added 4 commits May 16, 2020 23:09
Split these shape check functions into parts that don't require using
the element type as a generic parameter. This means that fewer
instantiations of this code are needed for a typical user of ndarray.
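
A minimal sketch of the idea in this commit, not the actual ndarray code: the function names and the exact overflow rule below are assumptions made for illustration. The generic wrapper only looks up the element size, and everything else lives in a function that is compiled once regardless of how many element types are in use.

```rust
use std::mem::size_of;

/// Non-generic part (hypothetical name): the element size arrives as a plain
/// value, so this body is compiled once for all element types.
/// Returns the element count, or None if the count or the byte size overflows.
fn size_of_shape_checked_bytes(shape: &[usize], elem_size: usize) -> Option<usize> {
    let mut count: usize = 1;
    for &len in shape {
        count = count.checked_mul(len)?;
    }
    // Assumed rule for this sketch: total size in bytes must fit in isize.
    let bytes = count.checked_mul(elem_size)?;
    if bytes > isize::MAX as usize {
        return None;
    }
    Some(count)
}

/// Thin generic wrapper (hypothetical name): the only per-type code left
/// is the `size_of::<A>()` lookup and the call.
fn size_of_shape_checked<A>(shape: &[usize]) -> Option<usize> {
    size_of_shape_checked_bytes(shape, size_of::<A>())
}
```

Calling `size_of_shape_checked::<f64>(&[2, 3])` and `size_of_shape_checked::<u8>(&[2, 3])` then shares a single copy of the checking loop instead of instantiating it once per element type.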
Remove redundant .clone() and simplify inner loop
The original design was that the apply cores were "resumable", and thus
we'd put the dimension "back in order" when exiting the method.

However, this is not in use today (and loop break from FoldWhile also
bypasses this tail code, anyway), so just remove this unused line.
The innermost loop is the stretch of constant stride (for the contiguous
case, this is the whole loop); with a lot of function arguments, this
part is actually common to the three loops (contig, c, and f).

Also make a small simplification of this inner loop; use a while loop to
have less code to compile in this heavily instantiated function.
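
The shape of this refactoring can be sketched as follows. This is an illustration only: it uses a safe slice with index arithmetic over a 2D row-major buffer, and all function names are hypothetical; it is not how Zip's apply_core_* methods are written. The point is that the constant-stride stretch is one shared function, written as a plain while loop, and the three outer loops only decide where each stretch starts and which stride it uses.

```rust
/// Shared inner loop (hypothetical name): visit `len` elements beginning at
/// `start`, stepping by a constant `stride`, and fold them with `f`.
fn inner_fold<A, Acc, F>(
    data: &[A],
    start: usize,
    stride: usize,
    len: usize,
    mut acc: Acc,
    f: &mut F,
) -> Acc
where
    F: FnMut(Acc, &A) -> Acc,
{
    let mut i = 0;
    while i < len {
        acc = f(acc, &data[start + i * stride]);
        i += 1;
    }
    acc
}

/// Contiguous case: the whole buffer is a single stretch with stride 1.
fn fold_contiguous<A, Acc, F>(data: &[A], acc: Acc, mut f: F) -> Acc
where
    F: FnMut(Acc, &A) -> Acc,
{
    inner_fold(data, 0, 1, data.len(), acc, &mut f)
}

/// C order over a rows x cols row-major buffer: each row is a stride-1 stretch.
fn fold_c_order<A, Acc, F>(data: &[A], rows: usize, cols: usize, mut acc: Acc, mut f: F) -> Acc
where
    F: FnMut(Acc, &A) -> Acc,
{
    let mut r = 0;
    while r < rows {
        acc = inner_fold(data, r * cols, 1, cols, acc, &mut f);
        r += 1;
    }
    acc
}

/// F order over the same buffer: each column is a stretch with stride `cols`.
fn fold_f_order<A, Acc, F>(data: &[A], rows: usize, cols: usize, mut acc: Acc, mut f: F) -> Acc
where
    F: FnMut(Acc, &A) -> Acc,
{
    let mut c = 0;
    while c < cols {
        acc = inner_fold(data, c, cols, rows, acc, &mut f);
        c += 1;
    }
    acc
}
```

For a 2 x 3 buffer holding 1 through 6, `fold_c_order` visits the elements in storage order while `fold_f_order` visits them column by column, and both go through the same `inner_fold` body.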
@bluss bluss merged commit 79d99c7 into master May 16, 2020
@bluss bluss deleted the reduce-generated-code branch May 16, 2020 21:43