Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order of JOIN statements not respected when building ActiveRecord query with merge and joins #34328

Closed
abinoda opened this issue Oct 27, 2018 · 6 comments · Fixed by #36805
Closed

Comments

@abinoda
Copy link

abinoda commented Oct 27, 2018

I'm building a query with ActiveRecord where I do a LEFT OUTER JOIN and then use merge to do an INNER JOIN. When I pass in GithubPr.joins(:github_user) as the argument to merge, the order of the JOIN statements is not as intended, resulting in a broken SQL query.

Steps to reproduce

Here's my query with a .to_sql to print the generated SQL.

GithubPr
        .select('p', 'count(*)')
        .from("generate_series('2018-08-03', '2018-10-26', '1 week') as dates(p)")
        .joins("LEFT OUTER JOIN github_prs on github_prs.created <= p")
        .merge(GithubPr.joins(:github_user))
        .group('p')
        .order('p ASC')
        .to_sql

Expected behavior

The query I would expect would be consistent with the order of the joins I built. I expect the LEFT OUTER JOIN to be first, followed by the INNER JOIN.

SELECT p, count(*)
FROM generate_series('2018-08-03', '2018-10-26', '1 week') as dates(p)
LEFT OUTER JOIN github_prs on github_prs.created <= p
INNER JOIN "github_users" ON "github_users"."id" = "github_prs"."github_user_id"
GROUP BY p
ORDER BY p ASC

Actual behavior

The query that is generated does not respect the order of my joins. Instead, the INNER JOIN comes before the LEFT OUTER JOIN, which does not work/

SELECT p, count(*)
FROM generate_series('2018-08-03', '2018-10-26', '1 week') as dates(p)
INNER JOIN "github_users" ON "github_users"."id" = "github_prs"."github_user_id"
LEFT OUTER JOIN github_prs on github_prs.created <= p
GROUP BY p
ORDER BY p ASC

Weird workaround

I've found a weird workaround where if I pass in a custom join like this:

merge(joins('INNER JOIN github_users on github_users.id = github_prs.github_user_id'))

instead of:

merge(GithubPr.joins(:github_user))

... then the order is correct and the SQL is valid.

System configuration

Rails version: 5.2.1
Ruby version: 2.5.0

@abinoda abinoda changed the title Order of joins not respected when building ActiveRecord query with merge and joins Order of JOIN statements not respected when building ActiveRecord query with merge and joins Oct 27, 2018
@y-yagi
Copy link
Member

y-yagi commented Oct 27, 2018

Can you create a reproduce script with this template?

@y-yagi y-yagi added the more-information-needed When reporter needs to provide more information label Oct 27, 2018
@abinoda
Copy link
Author

abinoda commented Oct 27, 2018

@y-yagi How's this? It seemed counter-productive to use test assertions on the raw SQL output so I just had the script print out the two examples I mentioned in my original post.

# frozen_string_literal: true

begin
  require "bundler/inline"
rescue LoadError => e
  $stderr.puts "Bundler version 1.10 or later is required. Please update your Bundler"
  raise e
end

gemfile(true) do
  source "https://rubygems.org"

  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  # Activate the gem you are reporting the issue against.
  gem "activerecord", "5.2.0"
  gem "sqlite3"
end

require "active_record"
require "minitest/autorun"
require "logger"

# This connection will do for database-independent bug reports.
ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")
ActiveRecord::Base.logger = Logger.new(STDOUT)

ActiveRecord::Schema.define do
  create_table :github_prs, force: true do |t|
    t.integer  :github_user_id
    t.datetime :created
  end

  create_table :github_users, force: true do |t|
  end
end

class GithubPr < ActiveRecord::Base
  belongs_to :github_user
end

class GithubUser < ActiveRecord::Base
end

puts "\n\n"

# join statements are ordered INCORRECTLY if I use merge(GithubPr.joins(:github_user))
puts "Invalid SQL:\n"
puts GithubPr
        .select('p', 'count(*)')
        .from("generate_series('2018-08-03', '2018-10-26', '1 week') as dates(p)")
        .joins("LEFT OUTER JOIN github_prs on github_prs.created <= p")
        .merge(GithubPr.joins(:github_user))
        .group('p')
        .order('p ASC')
        .to_sql

puts "\n\n"

# join statements are ordered CORRECTLY if I use merge(GithubPr.joins('INNER JOIN ...
puts "Correct SQL:\n"
puts GithubPr
        .select('p', 'count(*)')
        .from("generate_series('2018-08-03', '2018-10-26', '1 week') as dates(p)")
        .joins("LEFT OUTER JOIN github_prs on github_prs.created <= p")
        .merge(GithubPr.joins('INNER JOIN github_users on github_users.id = github_prs.github_user_id'))
        .group('p')
        .order('p ASC')
        .to_sql

@rails-bot rails-bot bot removed the more-information-needed When reporter needs to provide more information label Oct 27, 2018
@st0012
Copy link
Contributor

st0012 commented Nov 11, 2018

@abinoda what's the p in the query stands for? Have you tried using left_outer_joins for the left outer join?

@abinoda
Copy link
Author

abinoda commented Nov 11, 2018

@st0012

what's the p in the query stands for?

The p is selecting the date period from generate_series('2018-08-03', '2018-10-26', '1 week') as dates(p).

Have you tried using left_outer_joins for the left outer join?

To my knowledge, you can't pass in a string to left_outer_joins, and there's no other way to join using the conditiongithub_prs.created <= p. left_outer_joins("github_prs on github_prs.created <= p") results in an ArgumentError.

@abuisman
Copy link

abuisman commented Mar 6, 2019

I can confirm the order of joins getting changed:

User.left_joins(:email).joins("INNER JOIN something_cool ON some_prop = haha").to_sql
=> "SELECT \"users\".* FROM \"users\" INNER JOIN something_cool ON some_prop = haha LEFT OUTER JOIN \"emails\" ON \"emails\".\"user_id\" = \"users\".\"id\""

This becomes a problem when you want to use something from the 'emails' table in the ON of the second join, which becomes the first join due to re-ordering.

kamipo added a commit to kamipo/rails that referenced this issue Jul 29, 2019
Currently, string joins are always applied as last joins part, and Arel
join nodes are always applied as leading joins part (since rails#36304), it
makes people struggled to preserve user supplied joins order.

To mitigate this problem, preserve the order of string joins and Arel
join nodes either before or after of association joins.

Fixes rails#36761.
Fixes rails#34328.
Fixes rails#24281.
Fixes rails#12953.
@bananatron
Copy link

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants