AsyncDataloader freezes #4767
Comments
Hi! Yes, I don't think it freezes for any operation that uses a Dataloader source, but maybe we can narrow down the problem and find out why it's not working for you. Could you please share:
That will give a lot of information about what's going on when it freezes and maybe we can work up a replication script to figure out what's going wrong 👍
Here's the stack trace:
The query:
query {
  currentUser {
    profile { id }
  }
}
The relevant code:
# user_type.rb
class UserType
  field :profile, ProfileType, description: "for members only. Associated profile data"
  # ... omitted irrelevant code
  def profile
    context.dataloader.with(Sources::ProfileForUser).load(object.id)
  end
end
# profile_for_user.rb
class Sources::ProfileForUser < GraphQL::Dataloader::Source
  def fetch(users_ids)
    profiles = Profile.where(user_id: users_ids)
    users_ids.map do |user_id|
      puts "about to do 'puts caller' and then call profiles.find"
      puts caller
      profiles.find { |p| p.user_id == user_id }
    end
  end
end
The profiles.find call is the one that triggers the lazy evaluation of the SQL query in Active Record, so I am thinking the problem might not be in graphql-ruby. However, using Active Record in other ways seems to work.
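For reference, a source's fetch is expected to return one result per key, in the same order as the keys. The sketch below illustrates that contract in plain Ruby, without Active Record or graphql-ruby: `FakeProfile` and `fetch_profiles` are hypothetical names made up for this example, and the rows are indexed by `user_id` once instead of being scanned for every key. It is only meant to show the shape of the method and is not a fix for the freeze.

```ruby
# Hypothetical stand-in for an Active Record row (not part of the issue's code).
FakeProfile = Struct.new(:id, :user_id)

# Hypothetical helper with the same shape as Source#fetch:
# takes the batched keys and returns one result per key, in key order.
def fetch_profiles(user_ids, loaded_profiles)
  # Build the lookup table once, so each key costs a hash lookup
  # rather than a linear scan over all loaded rows.
  by_user_id = loaded_profiles.to_h { |profile| [profile.user_id, profile] }
  # Keys without a matching row map to nil.
  user_ids.map { |user_id| by_user_id[user_id] }
end

rows = [FakeProfile.new(1, 5), FakeProfile.new(2, 7)]
result = fetch_profiles([7, 99, 5], rows)
p result.map { |profile| profile&.id }
```

In a real source, `loaded_profiles` would presumably be the result of loading the relation once (for example with `to_a`), which also makes it explicit where the SQL query runs.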
I actually noticed it didn't freeze. It eventually finished executing, but it took over 3 minutes. This HTTP request normally takes around 90ms.
EDIT: I also noticed that when it "freezes", one logical CPU thread of my machine is at 100% and the process causing it is the puma web server. So something weird is happening in Ruby. In the logs, Active Record reports the queries as taking only a few ms.
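One way to tell busy-looping apart from waiting on I/O is to compare wall-clock time with CPU time around the suspicious call. The following is a minimal sketch using only Ruby's standard `Process.clock_gettime`; `measure_wall_and_cpu` is a hypothetical helper name, and the `sleep` stands in for whatever code is being measured. If the CPU figure is close to the wall figure, the process was spinning rather than waiting.

```ruby
# Hypothetical helper: returns wall-clock and CPU seconds spent in the block.
# Note: the CPU clock is process-wide, so work done by other threads
# during the block is counted too.
def measure_wall_and_cpu
  wall_start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  cpu_start  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
  yield
  wall = Process.clock_gettime(Process::CLOCK_MONOTONIC) - wall_start
  cpu  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID) - cpu_start
  { wall: wall, cpu: cpu }
end

# `sleep` is a placeholder for the code under suspicion.
timing = measure_wall_and_cpu { sleep 0.05 }
puts format("wall=%.3fs cpu=%.3fs", timing[:wall], timing[:cpu])
```

Wrapping the body of the source's fetch method in a helper like this would be one way to check whether the three minutes are spent on the CPU.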
Interesting -- thanks for sharing all those details. My first thought was, maybe it was stuck in the graphql-ruby/lib/graphql/dataloader/async_dataloader.rb Lines 38 to 43 in d93e2fb
But, I see that there's a Unfortunately my suggestion for I'm going to try to replicate this locally. In the meantime, I'm wondering if you could try a few experiments to help us isolate exactly where it's going wrong:
Let me know what you find -- I'll try to replicate on my end in the meantime!
I wrote a replication script, but it worked fine:
Querying for currentUser.profile.id with `AsyncDataloader`
require 'bundler/inline'
gemfile do
  source "https://rubygems.org"
  gem "graphql", "2.2.4"
  gem "rails", "7.1.2"
  gem "postgresql"
  gem "async", "~>2.8.0"
end
require "active_record"
require "ostruct" # OpenStruct is used for the root value below
ActiveSupport::IsolatedExecutionState.isolation_level = :fiber
ActiveRecord::Base.establish_connection("postgres://postgres:@localhost/postgres")
ActiveRecord::Schema.define do
  self.verbose = false
  create_table :profiles, force: true do |t|
    t.integer :user_id
  end
end
class Profile < ActiveRecord::Base
end
Profile.create!(user_id: 5)
class MySchema < GraphQL::Schema
  class ProfileSource < GraphQL::Dataloader::Source
    def fetch(users_ids)
      profiles = ::Profile.where(user_id: users_ids)
      users_ids.map do |uid|
        profiles.find { |pr| pr.user_id == uid }
      end
    end
  end
  class Profile < GraphQL::Schema::Object
    field :id, ID
  end
  class User < GraphQL::Schema::Object
    field :profile, Profile
    def profile
      dataloader.with(ProfileSource).load(object.id)
    end
  end
  class Query < GraphQL::Schema::Object
    field :current_user, User
  end
  query(Query)
  use GraphQL::Dataloader::AsyncDataloader
end
query_str = <<-GRAPHQL
{
  currentUser {
    profile {
      id
    }
  }
}
GRAPHQL
data = { current_user: OpenStruct.new(id: 5) }
pp MySchema.execute(query_str, root_value: data).to_h
# {"data"=>{"currentUser"=>{"profile"=>{"id"=>"1"}}}}
So... we have to find some more "moving parts" to identify what's going wrong here!
@rmosolgo I ran those tests:
Remove the dataloader.with(...).load(...) call, and instead, call Profile.find directly - Works
When using dataloader.with(...).load(...), what if you change def fetch to call users_ids.map { |id| Profile.find(id) } directly? FAIL
For this one I used this implementation:
class Sources::ProfileForUser < GraphQL::Dataloader::Source
  def fetch(users_ids)
    users_ids.map do |user_id|
      Profile.find_by(user_id:)
    end
  end
end
And it doesn't work, it gets stuck on line 4. Here is a snapshot while the query is running:
I am not getting any insight from this snapshot. To be honest, the Ruby ecosystem never really clicked with me, so I guess you will need to be patient 😄
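If an external snapshot is hard to read, the backtraces can also be dumped from inside the process. The sketch below uses only standard Ruby (`Thread.list`, `Thread#backtrace`); `dump_thread_backtraces` is a hypothetical helper name, not something provided by graphql-ruby or puma. One caveat: for each thread this shows only the fiber that is currently running, so suspended fibers would not appear.

```ruby
require "stringio"

# Hypothetical helper: writes the first `limit` backtrace lines
# of every live thread to the given IO object and returns that IO.
def dump_thread_backtraces(io = $stderr, limit: 20)
  Thread.list.each do |thread|
    io.puts "== #{thread.inspect}"
    # backtrace is nil for threads that have already died.
    (thread.backtrace || []).first(limit).each { |line| io.puts "   #{line}" }
  end
  io
end

# Capture into a string here; in an app this could go to $stderr or a log.
out = dump_thread_backtraces(StringIO.new).string
puts out
```

Calling this from a background thread a few seconds after the request starts is one possible way to see which Ruby frames are active while the CPU is pinned.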
Ugh, bummer -- the I'm not exactly sure where to look next. I'd say the Ruby ecosystem did click with me, but using Fibers this way is pretty new! I'm going to think about more debugging options and follow up here. In the meantime, I documented some other approaches for adding parallelism to |
Describe the bug
When using AsyncDataloader instead of the normal one, operations that involve a Source will freeze and the HTTP request never completes.
Versions
graphql version: 2.2.4
rails (or other framework): 7.1.2
graphql-pro version: 1.25.2
puma version: 6.4
async version: 2.8
GraphQL schema
This seems to happen when using any kind of dataloader source.
GraphQL query
Any query that involves using a dataloader source has this problem.
Steps to reproduce
Enable the async dataloader on a rails 7.1 project that uses dataloader sources, then try doing a query that involves reading from a source
Expected behavior
The query works correctly, as it does with the regular dataloader.
Actual behavior
The HTTP request and ActiveRecord read freeze. On PostgreSQL, the query appears stuck with wait status ClientRead when checking pg_stat_activity.
On the Rails process, the HTTP request is never resolved and the Ruby code is frozen in
results.find { |r| r.id == id }
in the fetch method of the Source.
Additional context
The problem occurs regardless of isolation level being set to fiber or not. It only occurs with AsyncDataloader and never occurs with the regular Dataloader.
I looked at other issues and followed the development of the feature and it seems it works for other people, so I am looking for some help to figure out why it does not work for us