Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grouped query performance of total_count using Arel #1022

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

bendilley
Copy link
Contributor

Improved performance for total_count on grouped ActiveRecord::Relation.

This re-implements #979 using Arel to avoid issues #1012 and #1015

After implementing it this way, it became obvious that Arel really wasn't doing anything special that couldn't safely be achieved with plain SQL, so I'm opening a 2nd PR to proffer the choice...

c
end
# Handle grouping with a subquery
@total_count = if c.group_values.any?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get rid of if and always use the subquery approach? What would be the downside? Is it less efficient?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum... there might be some issues when #total_count is called with a column_name...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what is the use case for calling #total_count with specific column_name, though.

# Handle grouping with a subquery
@total_count = if c.group_values.any?
sq = c.except(:select).select("1 AS record").arel.as("subquery")
c.model.connection.select_value Arel::SelectManager.new.from(sq).project(sq[:record].count)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also use Arel.star.count so it does not depend on alias set for subquery.

Suggested change
c.model.connection.select_value Arel::SelectManager.new.from(sq).project(sq[:record].count)
c.model.connection.select_value Arel::SelectManager.new.from(sq).project(Arel.star.count)

@bendilley
Copy link
Contributor Author

Thank you @guigs for taking the time to look over this PR! I am currently on a different project from the one that's driving this change but will hopefully be able to come and work on this in the next couple of weeks.

This test's data was built using a hash keyed by a name which was
included twice, so it would always pass as long as total_count builds a
valid query. Fixed by using a 2D array instead.
Improved performance for total_count on grouped ActiveRecord::Relation.

This re-implements kaminari#979 using Arel to avoid issues kaminari#1012 and kaminari#1015
Using `Arel.star` so it does not depend on an alias set in the subquery,
as suggested by @guigs on kaminari#1022.
Only count non-null values of column_name if supplied. This meets the
semantics of the SQL count function without requiring that column_name
is one of the grouped columns.
@bendilley
Copy link
Contributor Author

@guigs I have added two new commits to cover your Arel.star suggestion and to account for calling total_count with a column_name - the latter was the most reasonable way I could think to implement the semantics without requiring that column_name is one of the grouped columns.

N.B. I've also rebased this PR onto v1.2.1

@bendilley
Copy link
Contributor Author

A lot of the Travis test failures for Rails 4.2 look like they're related 😞 Hopefully I'll have time to look at them soon...

@bendilley
Copy link
Contributor Author

It seems that Arel::SelectManager.new with no args is unsupported in activerecord 4.2 - it needs an engine but this arg is dropped entirely in more recent versions, i.e. there's no backwards-compatible option for using Arel::SelectManager in this way.

We've established from the reverted #979 that we need to build a clean query, and using ActiveRecord::Relation#arel to acquire a Arel::SelectManager would come with unwanted baggage, so if we want to use Arel for this then that would mean dropping support for activerecord < 5 (any plans for that??)

The other option is to use raw SQL. I might reopen #1023 with a couple of modifications, to test whether that works better...

@guigs
Copy link

guigs commented Nov 2, 2020

How about dynamic checking for ActiveRecord::VERSION::MAJOR >= 5 and have both implementations?

It would make it easier to migrate to preferred solution once support for Active Record < 5 is dropped.

Regarding the logic for column_name parameter, we would need to add a test for that. But, to be honest, I think it should be deprecated unless there is a use case for it. It does not seem so, because currently there's no test for it, and it is already being ignored on the if loaded? case. But this probably should be a separate PR.

Arel::SelectManager.new without args is unsupported in ActiveRecord < 5.
@bendilley
Copy link
Contributor Author

@guigs please see my latest effort with SQL as a fallback for ActiveRecord < 5.

Setting the two solutions side-by-side like this hopefully demonstrates that they really are doing the exact same thing. I'm also just going to throw-in this case against using Arel: it's not technically a public API anymore (see rails/rails#32723).

@guigs
Copy link

guigs commented Nov 3, 2020

@bendilley Fair point. How about

unscoped.from(subquery, 'subquery').count

@bendilley
Copy link
Contributor Author

I think you've come back round to #1012 @guigs 😅

@guigs
Copy link

guigs commented Nov 3, 2020

Very cool! 🤣 Let me check that to understand why it is that not an option...

@guigs
Copy link

guigs commented Nov 3, 2020

Ok, so it also need to remove STI scope, as you are did for #1015. I'm not sure what's the issue with it, beside not being very nice syntax.

@bendilley
Copy link
Contributor Author

I'm not sure what's the issue with it, beside not being very nice syntax.

@yuki24's concern was that AR evidently comes with a lot of baggage and, although unscoped.unscope(:where) is probably the end of it, it's difficult to be certain that there are no edge cases: I thought #1015 had it covered until you pointed out just now that unscope(where: :type) doesn't cover a different inheritance_column!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants