Upgrading to 1.2.0 breaks total_count for distinct subqueries #1016

oehlschl · 2020-02-25T05:52:07Z

Encountered this issue when upgrading from 1.1.1 to 1.2.0, and it seems similar to #1015; likely also related to change #979.

The query below is a little contrived, but it demonstrates the issue and this test passes in 1.1.1 but fails in 1.2.0:

test 'total_count with distinct and group in subquery' do
   assert_equal 3, User.joins(:authorships).select('COUNT(*)').distinct.group(:id).order('COUNT(*) DESC').page(1).total_count
end

In 1.1.1, the code above issues the following query:

SELECT COUNT(DISTINCT "users"."id") AS count_id, "users"."id" AS users_id FROM "users" INNER JOIN "authorships" ON "authorships"."user_id" = "users"."id" GROUP BY "users"."id"

In 1.2.0, it becomes:

SELECT COUNT(*) FROM (SELECT DISTINCT 1 FROM "users" INNER JOIN "authorships" ON "authorships"."user_id" = "users"."id" GROUP BY "users"."id") subquery

Effectively, the distinct and select 1 don't work in combination. Admittedly probably an edge case but I thought it worth reporting nonetheless.

Thanks!

The text was updated successfully, but these errors were encountered:

yuki24 · 2020-02-29T23:38:24Z

@oehlschl I just reverted #979 as an attempt to fix the count-related issue. Could you try using the master branch on GitHub so we can be more confident about the fix?

bendilley · 2020-03-02T11:56:30Z

I understand why this presents as a bug but I would contest that using DISTINCT and GROUP BY in the same SQL query (whether you're building that using AR or not) is likely to indicate/result in problems anyway...

Let's think of DISTINCT as a lazy GROUP BY all_of, the, columns (please correct me if this assertion is wrong!). If you then add GROUP BY specific_column(s) to a query that uses DISTINCT, that DISTINCT becomes redundant: GROUP BY will always give you distinct results, by definition.

I know the example here is contrived, but if I were confronted with

.select('COUNT(*)').distinct.group(:id).order('COUNT(*) DESC')

then I'd immediately say "lose the distinct" - it's not doing anything because you're already grouping by id, so the results will be distinct. While leaving it in is theoretically harmless, it probably indicates an intention which isn't reflected in the outcome (or maybe a refactor has made it redundant, or a scope has been built-up from disparate expressions, etc.). In any case, I'd like to understand the author's intent at this point.

Thinking about it, maybe AR should apply distinct(false) as soon as group is scoped! 🤔

pjshwa · 2020-04-14T12:29:16Z

I think the revert commit of #979 deserves its own release.

phlegx · 2020-04-16T18:42:31Z

@yuki24 I have the same problem and tried out the master branch. Master branch works again! Please revert #979 and release.

yuki24 · 2020-05-28T00:58:17Z

Now that v1.2.1 is out this should be fixed now. Thanks for your patience!

yuki24 closed this as completed May 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Upgrading to 1.2.0 breaks total_count for distinct subqueries #1016

Upgrading to 1.2.0 breaks total_count for distinct subqueries #1016

oehlschl commented Feb 25, 2020

yuki24 commented Feb 29, 2020

bendilley commented Mar 2, 2020

pjshwa commented Apr 14, 2020

phlegx commented Apr 16, 2020

yuki24 commented May 28, 2020

Upgrading to 1.2.0 breaks total_count for distinct subqueries #1016

Upgrading to 1.2.0 breaks total_count for distinct subqueries #1016

Comments

oehlschl commented Feb 25, 2020

yuki24 commented Feb 29, 2020

bendilley commented Mar 2, 2020

pjshwa commented Apr 14, 2020

phlegx commented Apr 16, 2020

yuki24 commented May 28, 2020