Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrading to 1.2.0 breaks total_count for distinct subqueries #1016

Closed
oehlschl opened this issue Feb 25, 2020 · 5 comments
Closed

Upgrading to 1.2.0 breaks total_count for distinct subqueries #1016

oehlschl opened this issue Feb 25, 2020 · 5 comments

Comments

@oehlschl
Copy link

Encountered this issue when upgrading from 1.1.1 to 1.2.0, and it seems similar to #1015; likely also related to change #979.

The query below is a little contrived, but it demonstrates the issue and this test passes in 1.1.1 but fails in 1.2.0:

test 'total_count with distinct and group in subquery' do
   assert_equal 3, User.joins(:authorships).select('COUNT(*)').distinct.group(:id).order('COUNT(*) DESC').page(1).total_count
end

In 1.1.1, the code above issues the following query:

SELECT COUNT(DISTINCT "users"."id") AS count_id, "users"."id" AS users_id FROM "users" INNER JOIN "authorships" ON "authorships"."user_id" = "users"."id" GROUP BY "users"."id"

In 1.2.0, it becomes:

SELECT COUNT(*) FROM (SELECT DISTINCT 1 FROM "users" INNER JOIN "authorships" ON "authorships"."user_id" = "users"."id" GROUP BY "users"."id") subquery

Effectively, the distinct and select 1 don't work in combination. Admittedly probably an edge case but I thought it worth reporting nonetheless.

Thanks!

@yuki24
Copy link
Member

yuki24 commented Feb 29, 2020

@oehlschl I just reverted #979 as an attempt to fix the count-related issue. Could you try using the master branch on GitHub so we can be more confident about the fix?

@bendilley
Copy link
Contributor

I understand why this presents as a bug but I would contest that using DISTINCT and GROUP BY in the same SQL query (whether you're building that using AR or not) is likely to indicate/result in problems anyway...

Let's think of DISTINCT as a lazy GROUP BY all_of, the, columns (please correct me if this assertion is wrong!). If you then add GROUP BY specific_column(s) to a query that uses DISTINCT, that DISTINCT becomes redundant: GROUP BY will always give you distinct results, by definition.

I know the example here is contrived, but if I were confronted with

.select('COUNT(*)').distinct.group(:id).order('COUNT(*) DESC')

then I'd immediately say "lose the distinct" - it's not doing anything because you're already grouping by id, so the results will be distinct. While leaving it in is theoretically harmless, it probably indicates an intention which isn't reflected in the outcome (or maybe a refactor has made it redundant, or a scope has been built-up from disparate expressions, etc.). In any case, I'd like to understand the author's intent at this point.

Thinking about it, maybe AR should apply distinct(false) as soon as group is scoped! 🤔

@pjshwa
Copy link

pjshwa commented Apr 14, 2020

I think the revert commit of #979 deserves its own release.

@phlegx
Copy link

phlegx commented Apr 16, 2020

@yuki24 I have the same problem and tried out the master branch. Master branch works again! Please revert #979 and release.

@yuki24
Copy link
Member

yuki24 commented May 28, 2020

Now that v1.2.1 is out this should be fixed now. Thanks for your patience!

@yuki24 yuki24 closed this as completed May 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants