Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inconsistent results in case of LEFT OUTER JOIN, GROUP BY and HAVING #1107

Open
x2es opened this issue Aug 4, 2023 · 2 comments
Open

inconsistent results in case of LEFT OUTER JOIN, GROUP BY and HAVING #1107

x2es opened this issue Aug 4, 2023 · 2 comments

Comments

@x2es
Copy link

x2es commented Aug 4, 2023

Was working on Rails 5.1
Stop working on Rails 5.2+

Key difference:
on 5.1 was used extra DISTINCT id query and passed exact IDs to query
on 5.2 used LIMIT + OFFSET

It works wrong on dataset with LEFT OUTER JOIN. See details below

I am challenging Rails v5.2.8.1, but future versions broken as well.

Following test suite reproduces my issue for different Rails versions

begin
  require "bundler/inline"
rescue LoadError => e
  $stderr.puts "Bundler version 1.10 or later is required. Please update your Bundler"
  raise e
end

gemfile(true) do
  source "https://rubygems.org"
  # Activate the gem you are reporting the issue against.
  # gem "railties", "5.1.7"       # WORKS
  # gem "activerecord", "5.1.7"   # WORKS

  # TARGET VERSION
  gem "railties", "5.2.8.1"       # NOT WORKING
  gem "activerecord", "5.2.8.1"   # NOT WORKING
 
  # gem "railties", "~> 6.0.0"        # NOT WORKING
  # gem "activerecord", "~> 6.0.0"    # NOT WORKING

  # gem "railties", "~> 6.1.0"        # NOT WORKING
  # gem "activerecord", "~> 6.1.0"    # NOT WORKING

  # gem "railties", "~> 7.0.0"        # NOT WORKING
  # gem "activerecord", "~> 7.0.0"    # NOT WORKING

  gem "sqlite3"
  gem "kaminari-core", "1.2.2"
  gem "kaminari-activerecord", "1.2.2"
  gem "pry-byebug"
end

require "active_record"
require "minitest/autorun"
require "logger"

require "kaminari/core"
require "kaminari/activerecord"

# Ensure backward compatibility with Minitest 4
Minitest::Test = MiniTest::Unit::TestCase unless defined?(Minitest::Test)

# This connection will do for database-independent bug reports.
ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")
ActiveRecord::Base.logger = Logger.new(STDOUT)

ActiveRecord::Schema.define do
  create_table :posts, force: true do |t|
  end

  create_table :comments, force: true do |t|
    t.integer :post_id
  end
end

class Post < ActiveRecord::Base
  has_many :comments
end

class Comment < ActiveRecord::Base
  belongs_to :post
end

class BugTest < Minitest::Test
  #
  # Was working on Rails 5.1
  # Broken on Rails 5.2+
  #
  # GIVEN two Posts
  #   first with two Comments
  #   second without any Comments
  #
  # WHEN query 
  #   with LEFT OUTER JOIN "comments"
  #   and GROUP BY posts.id, comments.id HAVING (COUNT(comments.id) > 0)
  #
  #   scope = Post.eager_load(:comments).group('posts.id, comments.id').having('COUNT(comments.id) > 0')
  #
  # RESULT
  #
  #   raw dataset contains two records belongs to single Post
  #
  #     ActiveRecord::Base.connection.execute(scope.to_sql)
  #     => [
  #          {"t0_r0"=>1, "t1_r0"=>1, "t1_r1"=>1}, 
  #          {"t0_r0"=>1, "t1_r0"=>2, "t1_r1"=>1}
  #        ]
  #
  #        # tip: "posts"."id" AS t0_r0, "comments"."id" AS t1_r0, "comments"."post_id" AS t1_r1
  #
  #   ActiveRecord relation contains single item - Post
  #
  #     scope
  #     => [#<Post:0x00007f9d542141b8 id: 1>]
  #
  # EXPECTED
  #
  #   Pagination should treat Posts, not raw records
  #
  #   scope.page(1).per(1).total_count to be 1
  #   scope.page(1).per(1).total_pages to be 1
  #   scope.page(1).per(1) to be [<Post id:1>]
  #   scope.page(2).per(1) to be []
  #
  # ACTUAL
  #
  #   Pagination treats raw dataset
  #
  #   scope.page(1).per(1).total_count is 2
  #   scope.page(1).per(1).total_pages is 2
  #   scope.page(2).per(1) is [<Post#inst1 id:1>]
  #   scope.page(2).per(1) is [<Post#inst2 id:1>]
  #
  # TIPS
  #
  #   Rails 5.1 makes extra query with `SELECT  DISTINCT "posts"."id"`
  #   and then using exact post.id as WHERE criteria
  #
  #   Rails 5.2 using LIMIT + OFFSET
  #
  #   ---- Rails 5.1 --------------------------------------------
  #   [1] pry(#<BugTest>)> Rails.version
  #   => "5.1.7"
  #   [2] pry(#<BugTest>)> puts scope.page(1).per(1).to_sql
  #   D, [2023-08-04T12:06:22.125631 #3501021] DEBUG -- :   SQL (1.0ms)  SELECT  DISTINCT "posts"."id" FROM "posts" LEFT OUTER JOIN "comments" ON "comments"."post_id" = "posts"."id" GROUP BY posts.id, comments.id HAVING (COUNT(comments.id) > 0) LIMIT ? OFFSET ?  [["LIMIT", 1], ["OFFSET", 0]]
  #   SELECT "posts"."id" AS t0_r0, "comments"."id" AS t1_r0, "comments"."post_id" AS t1_r1 FROM "posts" LEFT OUTER JOIN "comments" ON "comments"."post_id" = "posts"."id" WHERE "posts"."id" = 1 GROUP BY posts.id, comments.id HAVING (COUNT(comments.id) > 0)
  #   => nil
  #   [3] pry(#<BugTest>)> puts scope.page(2).per(1).to_sql
  #   D, [2023-08-04T12:06:58.975675 #3501021] DEBUG -- :   SQL (1.0ms)  SELECT  DISTINCT "posts"."id" FROM "posts" LEFT OUTER JOIN "comments" ON "comments"."post_id" = "posts"."id" GROUP BY posts.id, comments.id HAVING (COUNT(comments.id) > 0) LIMIT ? OFFSET ?  [["LIMIT", 1], ["OFFSET", 1]]
  #   SELECT "posts"."id" AS t0_r0, "comments"."id" AS t1_r0, "comments"."post_id" AS t1_r1 FROM "posts" LEFT OUTER JOIN "comments" ON "comments"."post_id" = "posts"."id" WHERE (1=0) GROUP BY posts.id, comments.id HAVING (COUNT(comments.id) > 0)
  #   => nil
  #   -----------------------------------------------------------
  #
  #
  #   ---- Rails 5.2 --------------------------------------------
  #   [1] pry(#<BugTest>)> Rails.version
  #   => "5.2.8.1"
  #   [2] pry(#<BugTest>)> puts scope.page(1).per(1).to_sql
  #   SELECT  "posts"."id" AS t0_r0, "comments"."id" AS t1_r0, "comments"."post_id" AS t1_r1 FROM "posts" LEFT OUTER JOIN "comments" ON "comments"."post_id" = "posts"."id" GROUP BY posts.id, comments.id HAVING (COUNT(comments.id) > 0) LIMIT 1 OFFSET 0
  #   => nil
  #   [3] pry(#<BugTest>)> puts scope.page(2).per(1).to_sql
  #   SELECT  "posts"."id" AS t0_r0, "comments"."id" AS t1_r0, "comments"."post_id" AS t1_r1 FROM "posts" LEFT OUTER JOIN "comments" ON "comments"."post_id" = "posts"."id" GROUP BY posts.id, comments.id HAVING (COUNT(comments.id) > 0) LIMIT 1 OFFSET 1
  #   => nil
  #   -----------------------------------------------------------
  #

  def test_pagination_over_outer_join
    post_with_comments = Post.create!
    post_with_comments.comments << Comment.create!
    post_with_comments.comments << Comment.create!

    post_without_comments = Post.create!

    scope = Post.eager_load(:comments).group('posts.id, comments.id').having('COUNT(comments.id) > 0')
    scope_paginated = scope.page(1).per(1)

    binding.pry

    assert_includes scope_paginated, post_with_comments
    assert_empty scope.page(2).per(1), 'page 2 should be empty'  # TARGET CASE
    # assert_equal 1, scope_paginated.total_pages, 'total_pages should be 1'
    # assert_equal 1, scope_paginated.total_count, 'total_count should be 1'
  end
end
@x2es
Copy link
Author

x2es commented Aug 4, 2023

Some related issues:

#936
#1031

Maybe needed some warn/error similar to #1054

@arianf
Copy link

arianf commented Mar 5, 2024

I noticed something similar to this.

When I had duplicate results within my query, the total_count would calculation would early exit.

This seemed to be because of this:

# Total count is calculable at the last page
return @total_count = offset_value + @records.length if @records.any? && (@records.length < limit_value)

Which would early exit if @records.length < limit_value. It seemed like for example if you set your limit to 25 per page, that if your query has duplicates, those will get removed somewhere upstream, and it would result in 21 records for the first page. Then because that is less than the limit_value, Kaminari thinks it's on the last page, and doesn't do a correct total_count.

@yuki24 @tumayun I think was was introduced by: #932

Screenshot 2024-03-05 at 12 30 18 PM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants