Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of total_count for grouped queries #979

Merged

Conversation

MmKolodziej
Copy link
Contributor

Code for #978.

SQL query will return a single count instead of forcing AR to serialize a hash which entries will be counted.

@MmKolodziej MmKolodziej force-pushed the better_total_count_for_grouped_queries branch from 0f16250 to 642b535 Compare January 14, 2019 22:28
@yuki24
Copy link
Member

yuki24 commented Jan 15, 2019

Thanks for the PR - I should be able to look into this next week (I'm taking an OSS time off). Also you don't have to worry about the failing tests as they should be fixed upstream.

@yuki24 yuki24 merged commit b5e2e07 into kaminari:master Jan 25, 2019
@yuki24
Copy link
Member

yuki24 commented Jan 25, 2019

Thanks!

@yuki24
Copy link
Member

yuki24 commented Jan 25, 2019

Actually, the test failures look legit. I'll look into them to see if there's a quick fix.

@yuki24
Copy link
Member

yuki24 commented Jan 25, 2019

I have no idea why the master build is green, but if it's green, then good. Maybe there was something odd in Travis.

bendilley added a commit to skillstream/kaminari that referenced this pull request Feb 20, 2020
When querying a subtype on a STI table, a 'hard scope' is applied that
is incompatible with the use of `model.from(subquery)`

This fix removes any type restriction from the _outer_ query (because
the subquery already has the restriction).

Introduced by kaminari#979
yuki24 added a commit that referenced this pull request Feb 29, 2020
…or_grouped_queries"

This reverts commit b5e2e07, reversing
changes made to d1c26fd.

We merged #979 as a performance improvement, but it turns out it
introduced a couple of bugs. The solution seems to be a little tricky as
it has to call `.unscoped` and `unscope(where: :type)`, so let's revert
it until we find a better solution.

closes #1012, #1015.
@yuki24
Copy link
Member

yuki24 commented Feb 29, 2020

Sadly I had to revert this commit: 04d86ed as it introduced a couple of bugs. The workaround seems a bit tricky as it has to call unscoped and unscope(where: :type), and this seems to be running into a rabbit hole.

I still like this PR though. I added a few tests from #1012 and #1015 so next time we try we should be able to catch bugs early.

@bendilley
Copy link
Contributor

I'd like to add my +1 for this PR. Between this, #1012 and #1015, I do believe it's a good performance optimisation for larger result sets. In my opinion it's just ActiveRecord's syntax that makes it look hacky,

c.model.unscoped.unscope(where: :type).from(c.except(:select).select("1")).count

but actually we're just ensuring that the outer query is unpolluted with AR cleverness while the inner query c.except(:select).select("1") is tucked-in safe-and-sound. It's no more convoluted than the preceding logic in total_count to be fair.

I'd really like to see a commit which rebuilds this solution with all the accompanying tests.

@MmKolodziej
Copy link
Contributor Author

Well, first of all: Oopsy 😅
The two issues related to the improved counting make me a bit suspicious that unscoping two things won't yield more errors, because there might be even more AR magic.

I'm gonna try to find a bit of time to play around with arel to solve this one, as I believe it should make it easier to write a query that won't be contaminated by default conditions added to the model.

@bendilley
Copy link
Contributor

bendilley commented Mar 2, 2020

Of course! We only need the total count so that's a great idea to use arel to solve this.

@ChrisBAshton
Copy link

Thanks to everyone for their efforts so far. Is there a roadmap for the next release or could this be released as a patch 1.2.1?

@yuki24
Copy link
Member

yuki24 commented Mar 18, 2020

@ChrisBAshton It would be great if you could give the master branch a try and see if there is any issue with that. I haven't run into this issue personally so it's a bit difficult to make sure it's fixed. We should be able to cut a release once we are confident about this.

@ChrisBAshton
Copy link

Hi @yuki24 I've just tested against master and am still getting the error, unfortunately. It may be worth looking at #1012 again, as that seems to fix the regression.

@yuki24
Copy link
Member

yuki24 commented Mar 23, 2020

@ChrisBAshton Thanks for reporting. It seems like your app is open-source, so I should be able to look into it on my free time. In the mean time, it would be great if you could share example code or test and the error you are seeing.

@benthorner
Copy link

@yuki24 just following up on this - I work with @ChrisBAshton. I've tested this against the current master branch (04d86ed), and the issue appears to be resolved 🎉.

Hopefully there will be a patch version soon 🙏. In case it's helpful, the issue we're seeing is the same as the one in this PR. In our case:

I agree that #1012 would have been a surprise for us, and reverting the original cause of the bug seems like the best approach for now 👍.

@JaredRoth
Copy link

We're experiencing the exact same problem as @benthorner and @ChrisBAshton. Is there a plan to release that reversion to rubygems any time soon?

@yuki24
Copy link
Member

yuki24 commented Apr 23, 2020

We've just talked about cutting a new version soon. Stay tuned and thanks for your patience!

bendilley added a commit to skillstream/kaminari that referenced this pull request May 13, 2020
Improved performance for total_count on grouped ActiveRecord::Relation.

This re-implements kaminari#979 using Arel to avoid issues kaminari#1012 and kaminari#1015
bendilley added a commit to skillstream/kaminari that referenced this pull request May 13, 2020
Improved performance for total_count on grouped ActiveRecord::Relation.

This re-implements kaminari#979 using plain SQL to avoid issues kaminari#1012 and kaminari#1015
bendilley added a commit to skillstream/kaminari that referenced this pull request May 13, 2020
Improved performance for total_count on grouped ActiveRecord::Relation.

This re-implements kaminari#979 using Arel to avoid issues kaminari#1012 and kaminari#1015
@bendilley
Copy link
Contributor

@MmKolodziej I took your idea to rewrite it in Arel and ran with it, came up with #1022. I then realised that Arel was overkill and so I've turned-in a plain SQL version too, #1023, which is easier on the eye (and brain!)

🤞 hoping one of these solutions will make it in.

bendilley added a commit to skillstream/kaminari that referenced this pull request May 29, 2020
Improved performance for total_count on grouped ActiveRecord::Relation.

This re-implements kaminari#979 using plain SQL to avoid issues kaminari#1012 and kaminari#1015
@kitsunde
Copy link

Arel solves things beyond just constructing the query. It allows the statement to continue to be prepared. When we cast active record with .to_sql it becomes raw SQL which will no longer be bound. This then creates a couple of issues:

  • Query planning can be the majority of the execution time. Our hot-path queries have 2/3 of the execution as query planning time in PG.
  • In monitoring and profiling tools, these statements will have variance and become difficult to get aggregate statistics on.

@bendilley
Copy link
Contributor

@kitsunde does that make #1022 the better solution, in your opinion?

@guigs
Copy link

guigs commented Oct 9, 2020

@kitsunde made a very good point. Based on this I also think the Arel solution is better.
I even did some tests comparing them and confirm that prepared statement works with Arel but not with to_sql.

@kitsunde
Copy link

kitsunde commented Oct 9, 2020

It's definitely a better solution. While I'm relatively familiar with Arel I'm not familiar with c.model.connection.select_value though.

What I would want to know is:

  1. Does all the binds carry through.
  2. Are we still re-using the prepared statement.

Because if 2 doesn't happen, then repeat execution will just split the overhead and make it worse than raw SQL.

bendilley added a commit to bendilley/kaminari that referenced this pull request Oct 29, 2020
Improved performance for total_count on grouped ActiveRecord::Relation.

This re-implements kaminari#979 using Arel to avoid issues kaminari#1012 and kaminari#1015
@bendilley
Copy link
Contributor

@kitsunde to answer your questions:

  1. Yes, all the binds carry through
  2. No, we're not reusing the statement...

select_value executes a query that returns a single column of a single row, returning its value. Because the total_count method's purpose is to return only an integer value, the query we build to acquire that value is then (as far as I can see) discarded.

Anyway, I closed #1023 because it clearly didn't smell right to anyone, and have updated #1022 in response to @guigs's comments.

@guigs
Copy link

guigs commented Nov 3, 2020

I think

unscoped.unscope(:where).from(c.except(:select).select('1')).count

would be a good solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants