Remove lto from the release profile #2401

Merged 1 commit on Jun 17, 2020

KyleSiefring
Collaborator

The performance difference has shrunk substantially after adding inline
to a bunch of functions. The performance difference between lto on and
off is about 4 seconds on the slowest clip/qp on a standard awcy run
(MINECRAFT, objective-1-fast, qp 80). The compile time difference is 42
seconds.

If it's possible to have good performance with lto off, then optimizing
without it will provide a good development profile. Going forward, it
would be advisable to use lto to check for performance problems. It
may be possible to create a separate profile between dev and release.
AWCY would need to be modified to use it while still working with old
versions of rav1e.
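
A rough sketch of what such a separate profile could look like, assuming a Cargo version where custom named profiles are available. The profile name `release-lto` is hypothetical and is not added by this PR, and `lto = true` is an assumed setting rather than the value rav1e used:

```toml
# Cargo.toml (sketch only; `release-lto` is a hypothetical profile name)

# The release profile is left without lto, which keeps compile times down.

# Opt-in profile for checking performance with lto enabled.
[profile.release-lto]
inherits = "release"   # start from the release settings
lto = true             # assumed value; "thin" is another option Cargo accepts
```

With custom profile support, a profile like this would be selected with `cargo build --profile release-lto`, while a plain `cargo build --release` stays on the faster-to-compile settings.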

@coveralls
Collaborator

coveralls commented Jun 16, 2020

Coverage Status

Coverage increased (+0.005%) to 81.983% when pulling a057c0a on KyleSiefring:cancel_lto_review into b14cbc7 on xiph:master.

@lu-zero
Collaborator

lu-zero commented Jun 17, 2020

Please mention rust-lang/cargo#6988

@lu-zero
Collaborator

lu-zero commented Jun 17, 2020

Using rustc 1.44.0:

On x86_64 (macos):
with lto:

real	3m52.024s
user	9m18.176s
sys	0m17.777s

w/out lto:

real	3m0.522s
user	8m21.834s
sys	0m15.617s

On aarch64 (linux):
with lto:

real    7m11.829s
user    13m57.967s
sys     0m15.912s

w/out lto:

real    5m47.165s
user    12m34.384s
sys     0m15.241s

I didn't run a benchmark yet

@lu-zero
Collaborator

lu-zero commented Jun 17, 2020

On Arm lto is about 2% faster now.

Collaborator

@lu-zero lu-zero left a comment

From a batch of benchmarks, lto gives around a 2% speedup for a 25% increase in compile time.
It can be removed from the defaults now.
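
The compile-time figure can be cross-checked against the wall-clock `real` times posted earlier in this thread. The following is only a sketch of that arithmetic; the helper names `to_seconds` and `overhead_percent` are made up for illustration, and the numbers are copied from the rustc 1.44.0 timings above:

```rust
/// Convert a `time`-style reading of minutes and seconds into seconds.
fn to_seconds(minutes: u32, seconds: f64) -> f64 {
    f64::from(minutes) * 60.0 + seconds
}

/// Percentage by which the lto build takes longer than the non-lto build.
fn overhead_percent(with_lto: f64, without_lto: f64) -> f64 {
    (with_lto / without_lto - 1.0) * 100.0
}

fn main() {
    // `real` times from the x86_64 (macos) run: 3m52.024s vs 3m0.522s.
    let x86 = overhead_percent(to_seconds(3, 52.024), to_seconds(3, 0.522));
    // `real` times from the aarch64 (linux) run: 7m11.829s vs 5m47.165s.
    let arm = overhead_percent(to_seconds(7, 11.829), to_seconds(5, 47.165));

    println!("x86_64 (macos): {:.1}% longer build with lto", x86);
    println!("aarch64 (linux): {:.1}% longer build with lto", arm);
}
```

Both platforms land in the mid-to-high twenties, which is in line with the roughly 25% quoted here.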

The performance difference has shrunk substantially after adding inline
to a bunch of functions. The performance difference between lto on and
off is about 4 seconds on the slowest clip/qp on a standard awcy run
(MINECRAFT, objective-1-fast, qp 80). The compile time difference is 42
seconds.

If it's possible to have good performance with lto off, then optimizing
without it will provide a good development profile. Going forward, it
would be advisable to use lto to check for performance problems. It
may be possible to create a separate profile between dev and release.
This would require rust-lang/cargo#6988, and AWCY would need to be
modified to use the new profile while still working with old versions
of rav1e.

@KyleSiefring KyleSiefring merged commit 35e904e into xiph:master Jun 17, 2020