Quite long compilation time #2170

Closed
wangfenjin opened this issue Jul 26, 2022 · 9 comments
Labels
question Further information is requested

Comments

@wangfenjin
Contributor

The arrow crate seems slow to compile. Is there some way to optimize this? Thanks.

The attached timing report was generated by cargo build --no-default-features --timings. Please rename the file to .html before opening it.
cargo-timing-20220726T055116Z.html.pdf

@wangfenjin wangfenjin added the question Further information is requested label Jul 26, 2022
@tustvold
Contributor

Possibly related: #1858

This is something I would be very interested in improving, but I've not had time to really sit down with this yet.

Two rather crude approaches could be to split up the crate to allow greater build parallelism, or to add more features to allow people to opt out of functionality.
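
As a rough illustration of the opt-out approach from the consumer side, a downstream Cargo.toml can already disable the default feature set, much like the --no-default-features build above. This is only a sketch: the wildcard version is a placeholder, and which functionality the default features cover depends on the arrow release in use.

```toml
[dependencies]
# Hypothetical downstream manifest; replace "*" with the arrow version you actually use.
# default-features = false turns off the crate's default feature set,
# so only explicitly listed features (none here) are compiled.
arrow = { version = "*", default-features = false }
```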

Similarly, increasing codegen-units might help, although potentially at the cost of less optimized builds.
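
For reference, codegen-units is a per-profile Cargo setting, so an experiment could look like the sketch below. The value 256 is purely illustrative (Cargo's release profile defaults to 16), and whether it helps for this crate would need to be measured.

```toml
# Cargo.toml of the workspace or crate being built.
[profile.release]
# Illustrative value, not a recommendation: more units allow more parallel
# LLVM codegen, but give the optimizer less to work with per unit.
codegen-units = 256
```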

The major bottlenecks last time I looked were linking and LLVM codegen, which makes it hard to optimise without dropping functionality...

@wangfenjin
Contributor Author

Could you point me in the right direction on linking and codegen? I may take a look, but I don’t know where to start 😂

@tustvold
Contributor

https://blog.rust-lang.org/inside-rust/2020/02/25/intro-rustc-self-profile.html might help get you started. It describes how to run the self-profiler to see where the compilation process is spending its time.

@wangfenjin
Contributor Author

I tried the tools. Here is the output from the summarize tool; LLVM_module_codegen_emit_obj accounts for about half of the total time.

+--------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| Item                                             | Self time | % of total time | Time     | Item count | Incremental result hashing time |
+--------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| LLVM_module_codegen_emit_obj                     | 88.00s    | 50.101          | 88.00s   | 256        | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| LLVM_passes                                      | 20.61s    | 11.734          | 20.61s   | 1          | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| self_profile_alloc_query_strings                 | 16.32s    | 9.294           | 16.33s   | 1          | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| codegen_module                                   | 15.73s    | 8.955           | 20.98s   | 256        | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+

Honestly, I have no idea what this means. And the file generated by crox is too large to be processed by my Mac...

@tustvold
Contributor

Given the bottleneck is codegen, and assuming that execution time is roughly proportional to the amount of generated LLVM IR, using a tool like cargo llvm-lines as done in #1858 is probably a good starting point to identify areas of improvement.

@psvri
Contributor

psvri commented Aug 18, 2022

Hello,

@tustvold @wangfenjin on Linux, rui314/mold gives very good improvements.

With mold, full builds improve by only about 10-20%, but incremental builds complete in half the time.
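
For anyone who wants to try this, a minimal sketch of a .cargo/config.toml that routes linking through mold, following the pattern in mold's README. The target triple and the /usr/bin/mold path are assumptions about the machine; adjust both to match the local setup.

```toml
# .cargo/config.toml (Linux only)
# Assumed target triple; check yours with `rustc -vV`.
[target.x86_64-unknown-linux-gnu]
# Use clang as the linker driver and ask it to invoke mold.
linker = "clang"
# Assumed install location of the mold binary.
rustflags = ["-C", "link-arg=-fuse-ld=/usr/bin/mold"]
```

Because only the link step changes, this mostly benefits incremental builds, where linking is a larger share of the total time.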

@tustvold
Contributor

The recent changes to add more dictionary array comparison kernels have really hurt this... I might need to take some time to look into this 🤔

@tustvold
Contributor

An update on this: I've filed a couple of PRs that cut out a fair amount of compile time.

The work on splitting up the arrow crate (#2594) is also ongoing. This should achieve a couple of things:

  • Improve build parallelism
  • Allow taking lighter weight dependencies
  • Make it clearer where compilation time is being spent (primarily the comparison, cast and arithmetic kernels)

@tustvold
Contributor

At least on my computer, a full release compilation now takes just under 20 seconds, so I think we can consider this issue closed for now.
