Inconsistent execution results #12
Ok, playing around with this a bit more, it feels like this might be caused by the database answering too fast, before the juniper executor gets to resolve the second order — so this might be a non-issue on a real production server. |
After some more thought, this issue can't be related to the database or to anything in my `BatchFn` impl. When calling the loader in a non-async resolver, things are batched properly 100% of the time, i.e.:

```rust
fn user(&self, ctx: &Context) -> Option<types::User> {
    let fut = ctx.loaders.user.load(self.user.clone());
    tokio::task::spawn(async {
        let _user = fut.await.unwrap();
    });
    None
}
```

So now I'm very suspicious of this line. Maybe rescheduling at the event loop is not enough; juniper might not have called the resolver of the neighbor field yet? |
Adding a delay in the async resolver works around it:

```rust
async fn user(&self, ctx: &Context) -> types::User {
    let fut = ctx.loaders.user.load(self.user.clone());
    tokio::time::delay_for(std::time::Duration::from_nanos(1)).await;
    fut.await.unwrap()
}
```
The whole delay thing would otherwise have to live inside the loader's `LoadFuture`:

```rust
pub struct LoadFuture<K, V, E, F> {
    id: usize,
    stage: Stage,
    state: Arc<Mutex<State<K, Result<V, LoadError<E>>, F, BatchFuture<V, E>>>>,
}

impl<K, V, E, F> Future for LoadFuture<K, V, E, F>
where
    E: Clone,
    F: BatchFn<K, V, Error = E>,
{
    type Output = Result<V, LoadError<E>>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
        let state = self.state.clone();
        let mut st = state.lock().unwrap();
        // Value already loaded by an earlier batch: hand it out.
        if st.loaded_vals.contains_key(&self.id) {
            self.stage = Stage::Finished;
            return Poll::Ready(st.loaded_vals.remove(&self.id).unwrap());
        }
        // Key already queued in an in-flight batch: drive that batch.
        if let Some(batch_id) = st.loading_ids.get(&self.id) {
            let batch_id = *batch_id;
            ready!(st.poll_batch(cx, batch_id));
            self.stage = Stage::Finished;
            return Poll::Ready(st.loaded_vals.remove(&self.id).unwrap());
        }
        // Otherwise dispatch a new batch immediately and drive it.
        let batch_id = st.dispatch_new_batch(self.id);
        ready!(st.poll_batch(cx, batch_id));
        self.stage = Stage::Finished;
        Poll::Ready(st.loaded_vals.remove(&self.id).unwrap())
    }
}
``` |
From what I understand, the best place to put a delay to push back on the execution stack is in juniper itself, right before the resolvers run. Does somebody have any thoughts on this? |
Instead of the delay, I would propose https://docs.rs/tokio/0.2.13/tokio/task/fn.yield_now.html. I literally just had a look at this crate 10 minutes ago, so I'm not ready to comment further, but I'd also be interested in consistent behavior with async. I don't see the conceptual need for non-determinism there (but I might be wrong). |
I released a new version, 0.8, using async/await, with some breaking changes. See if you want to try it out. |
Ok, I tested with 0.8. The new API requires writing the `BatchFn` impl like this:

```rust
pub struct ModelBatcher;

impl<T> BatchFn<ObjectId, Option<T>> for ModelBatcher
where
    T: Model,
{
    type Error = ();

    fn load<'a, 'b, 'c>(
        &'a self,
        keys: &'b [ObjectId],
    ) -> Pin<Box<dyn Future<Output = HashMap<ObjectId, Result<Option<T>, Self::Error>>> + Send + 'c>>
    where
        'a: 'c,
        'b: 'c,
        Self: 'c,
    {
        log::debug!("load batch {:?}", keys);
        Box::pin(async move { T::load_many(&keys).await })
    }
}
```

which you'll agree feels really messy. |
Thanks for testing. With `#[async_trait]` it can be written as:

```rust
#[async_trait]
impl<T> BatchFn<ObjectId, Option<T>> for ModelBatcher
where
    T: Model + 'static,
{
    type Error = ();

    async fn load(&self, keys: &[ObjectId]) -> HashMap<ObjectId, Result<Option<T>, Self::Error>> {
        println!("load batch {:?}", keys);
        T::load_many(&keys).await
    }
}
```

For the loader part, could you share your test code? |
Here is a reproduction of the bug: https://github.com/IcanDivideBy0/dataloader-bug |
I really think the issue is more related to how juniper resolves its tree. It looks like once a future is resolved, juniper doesn't wait for all futures at the same level to complete before continuing to descend the tree. |
@IcanDivideBy0 thanks for the demo project to reproduce. For the lifetime issue, I updated `BatchFn` in the master branch; you can now add a `T: 'async_trait` bound:

```rust
#[async_trait]
impl<T> BatchFn<ObjectId, Option<T>> for ModelBatcher
where
    T: Model,
{
    type Error = ();

    async fn load(&self, keys: &[ObjectId]) -> HashMap<ObjectId, Result<Option<T>, Self::Error>>
    where
        T: 'async_trait,
    {
        println!("load batch {:?}", keys);
        T::load_many(&keys).await
    }
}
``` |
@cksac thanks! `BatchFn` is way more pleasant to use with generics now :) 👍 |
I updated the examples to call the loader at the same level. |
I updated the example to use dataloader version 0.6.0 / 0.7.0, and it works as expected. |
That's good news! It looks like that's something that can finally be fixed here. |
When spamming the request with v0.6.0, the issue can still occur from time to time. Try adding something like this in the loader, then keep Ctrl+Enter pressed in GraphiQL; after about 20 seconds I get a crash.

Sorry, I can't find a more consistent way to reproduce this... |
I managed to fix it. As for the batch size: I think a load batch size anywhere between 1 and max_batch_size is expected, since the loader performs a load whenever a caller demands the value. |
@cksac great, the multiple-yields trick did the job! Although it feels incredibly hacky, I don't see any other way. |
I've noticed that sometimes batching works, and sometimes not... I'm using a very simple schema at the moment, and sending the same request multiple times can give different results in how the loader is executed.

I'm using a loader on the `orders.user` field, and the `UserLoader` is basically a copy-paste of the example in the juniper doc. The db contains 2 orders, both having the same user field. Looking at the logs of the same request sent multiple times, we can see that sometimes load batch is called with 2 ids, and sometimes it is called twice with the same id.

More info: I tested with both `threaded_scheduler` and `basic_scheduler`, with the same result. I can provide more code if necessary, or maybe even a simple repo to reproduce the issue.