Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor performance tweaks #8048

Closed
wants to merge 2 commits into from

Conversation

rrosenblum
Copy link
Contributor

This is somewhat related to #8022. I decided to briefly look over some of the potential performance issues. I don't think anything in here is going to radically improve performance. It mostly seemed like low hanging fruit to avoid an extra allocation.

@marcandre
Copy link
Contributor

Sorry I haven't had time to plugin some measuring tools. Without that, it's difficult to see how good a change is. Do you know the first rule of optimisation? 😆

In your PR, for example, I bet you a beer that your changes of children.each to each_child_node will make things slower.

Of your porposed changes, I like the one to cop/variable_force.rb because it's a good use of each_children_node. I'm afraid I'm less convinced that the others are actual improvements (speed or legibility).

If you want to help with #8022 it would be awesome. I'll try to get a basic tool for measurement, but there's already a benchmarking task you can use, since you know which cops to attack. Those cops should get at least an order of magnitude faster. Many of them seem to rely on analyzing the tokens for whitespace and all...

@rrosenblum
Copy link
Contributor Author

I have not tested these changes with any kind of profiler, so it is possible that they don't make any kind of impact. I've made changes similar to that in cop/variable_force.rb to help with performance in the past. Same with converting chained looping operators (map, select, reject, each) to use each_with_object. As the JIT compiler is improved, I'm not sure that those kinds of changes will be necessary in the future.

What is your reasoning for believing changing children.each to each_child_node will slow things down?

In the past, I have relied on memory_profiler to help with fixing performance issues

@marcandre
Copy link
Contributor

What is your reasoning for believing changing children.each to each_child_node will slow things down?

First, each_child_node does something additional which is to check that a child is a node before yielding it. That testing is not done with children.each so it's not quite possible to do more and yet be faster.

But even assuming there was no additional checking, it's not possible for each_child_node to be faster because it has to, whatever implementation it has, call children itself (or access @children) and must call each (or some other iterator which would be slower). So it's at best exactly the same, probably a tiny bit slower (say one more method call).

That's the very best case scenario, which would requires the implementation to use &block and Ruby 2.5+; if it calls yield it will be slower as builtin methods (like Array#each here) yield faster than Ruby methods. If it checks if it needs to return an enumerator if a block is given (as it should and it does), that's more processing, etc.

You just can't beat an attribute call (super optimized) + builtin Array#each (super optimized).

Even if you avoided creating an intermediate array (which you can't here), it is typically not faster. It's a bit frustrating, but yielding from Ruby code is actually a bit slow. When optimizing Parser's TreeRewriter I measured that it was faster to return a temporary array of things and act on them then yielding them and avoid creating the intermediate array (few thousand elements IIRC, test in Ruby 2.7) 🤷‍♂️.

As for any talk about optimization: this is theoretical, as unless someone changes the O of an algorithm (say like O(n) vs O(1)), or someone measures that there's actually a bottleneck at some point, it's typically not worth trying to optimize it.

The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times
-- Knuth

@rrosenblum
Copy link
Contributor Author

rrosenblum commented Jun 7, 2020

Thanks for providing the feedback. There were some aspects of the change that I had not considered before. You are absolutely correct about node.children.each being better than node.each_child_node.

It looks like there are several places in the code that could be changed to replace each_child_node. I doubt there would be much noticeable improvement.

There are a couple of small changes in here that I think still could be good. Overall, they are rather small. I still need to actually test the performance diff.

@rrosenblum rrosenblum closed this Jul 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants