Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide examples for async use of Hub #507

Open
peterstuart opened this issue Oct 13, 2022 · 7 comments
Open

Provide examples for async use of Hub #507

peterstuart opened this issue Oct 13, 2022 · 7 comments

Comments

@peterstuart
Copy link

I recently had an issue (events/spans being recorded in the wrong places) which was caused by me not understanding that I needed to be manually cloning/creating Hubs and using bind_hub on futures which I was joining/spawning/etc. (See support request 72484 if you're interested). The docs do explain that, but:

  • I naively assumed that sentry-tracing would deal with that for me, since it already has the concepts of spans which work with futures
  • the docs are pretty light on details about exactly what I needed to do

I think it'd be helpful to provide an example in the repo of manually using bind_hub, and I think it might be worth linking to something about working with async code more prominently (the front page of the docs)?

I'm also not totally clear on exactly how to manually deal with Hubs. Should I be using Hub::new_from_top vs Hub::new, or just reusing Hub::current? (I don't understand what "the top scope of another hub" means in the docs for new_from_top.) Hub::new_from_top(Hub::current()) seems to be working the way I'd expect so far. 🤷🏻

Anyway, thanks for the great SDK. I'm very happy with how seamless it's been to use (especially with the tower and tracing integrations). Just thought I'd report the troubles I ran into.

@Swatinem
Copy link
Member

Hi! Yes the support request came through me as well.

I understand this is a confusing topic, as the ecosystem itself exposes some non-obvious footguns around this concept.

The upstream tracing docs also have a related warning here: https://docs.rs/tracing/latest/tracing/span/struct.Span.html#in-asynchronous-code
I also blogged about some related issues here: https://swatinem.de/blog/non-lazy-futures/ and https://swatinem.de/blog/log-contexts/


Either way, there are two general patterns here:

If you spawn tasks, or use futures::join concurrency, you have to always create a new hub (new_from_top, aka "inherit" the current scope properties and breadcrumbs).

As those tasks can run concurrently (or in parallel on multiple threads), they need their own copy of the Hub/Scope so they don’t interfere with each other.

The only exception from this rule is if you immediate await a single spawned task. In that case it is okay (and likely desired) that the spawned task will reuse the same scope and breadcrumbs as the "caller", as there is little difference from spawn/await and a normal function call.
If the spawned task outlives its "caller", it should get its own Hub however.


As these are indeed footguns and hard concepts to teach, I would appreciate feedback on how we could improve the docs in this regard. From your perspective, is the explanation above sufficient? Where should these things be explained? The main Hub docs? Maybe a specific section somewhere that talks about async code? This is also not specific to async, thread::spawn concurrency has the same problem, its just a lot more obvious when dealing with async code.

@peterstuart
Copy link
Author

If you spawn tasks, or use futures::join concurrency, you have to always create a new hub (new_from_top, aka "inherit" the current scope properties and breadcrumbs).

One other time I found I needed to create a new hub was when processing a stream in parallel:

let hub = Hub::current();

let x: Vec<_> = stream::iter(inputs)
    .map(|input| {
        async move { Self::do_something(input, pool).await }
            .bind_hub(Hub::new_from_top(&hub))
    })
    .buffered(10)
    .try_collect()
    .await?;

The only exception from this rule is if you immediate await a single spawned task. In that case it is okay (and likely desired) that the spawned task will reuse the same scope and breadcrumbs as the "caller", as there is little difference from spawn/await and a normal function call.

Ah, interesting—I didn't realize that awaiting a spawned task would run it within the context of the current future.

As these are indeed footguns and hard concepts to teach, I would appreciate feedback on how we could improve the docs in this regard. From your perspective, is the explanation above sufficient? Where should these things be explained? The main Hub docs? Maybe a specific section somewhere that talks about async code? This is also not specific to async, thread::spawn concurrency has the same problem, its just a lot more obvious when dealing with async code.

I think the explanation above is great. Adding it to the Hub docs makes sense to me, although I think additionally pointing to that from somewhere more prominent (front page of the docs?) would be helpful too. "If you're writing async or threaded code, see the Hub documentation"? Maybe I should have read the docs more thoroughly to start with, but it took me a long time to figure out that the Hub docs were where I should be looking, because I hadn't actually used Hubs at all manually, since the tracing and tower integrations had been sufficient for everything else I needed to do.

Thanks again!

@Swatinem
Copy link
Member

Ah, interesting—I didn't realize that awaiting a spawned task would run it within the context of the current future.

Actually, I said the wrong thing here :-D You do have to bind a hub, but you can bind the current hub, instead of a new one.
Hubs are refcounted and have internal mutability anyway, so you can bind the same copy to the spawned future. It does not need to be a new one.

I will try to add some more docs to the hub overview and main page.

@Swatinem
Copy link
Member

I needed to create a new hub was when processing a stream in parallel:

Well yes, because parallelism / concurrency means the tasks are running independently.

@Swatinem
Copy link
Member

#509 is a try at improving the docs around this with some examples.
Also while creating the examples, I noticed (yet again) that our Hub APIs are extremely cumbersome to use.
I will try to get more clarity around these things internally and see if we can improve things.

@gautamg795
Copy link

Hi @Swatinem, sorry to "hijack" this thread but I think I'm still using bind_hub incorrectly, as I'm seeing unexpected results in the error events on Sentry when a task spawned from an HTTP request fails.

Specifically we encounter this with websocket handlers in Axum — the server itself follows the example to use the sentry-tower layers (e.g. svc.layer(sentry_tower::NewSentryLayer::<_>::new_from_top()).layer(sentry_tower::SentryHttpLayer::new()))

In the websocket handling method, I try to create a new hub and bind it to the future that handles the websocket upgrade:

async fn ws_handler(
    ws: axum::extract::ws::WebSocketUpgrade,
    ...
) -> Result<impl IntoResponse, MyError> {
    let hub = sentry::Hub::new_from_top(sentry::Hub::current());
    Ok(ws.on_upgrade(|ws| handle_websocket(ws, ...).bind_hub(hub)))
}

The first time something errors inside handle_websocket, everything looks great — the request path and headers show up on Sentry as expected. However, errors in subsequent requests have the same metadata — same request path, headers, etc — implying the scope isn't getting properly popped, or the hub is being nested incorrectly (I don't totally understand).

As far as I can tell, this follows the examples you've provided above and in the docs, so any help would be appreciated!

@Swatinem
Copy link
Member

errors in subsequent requests have the same metadata

I assume that is for every new request that goes through the websocket upgrade?
I haven’t had a chance to try websockets with axum yet though, so no idea how that works in detail. The code you presented does look correct.
How does the rest of your Router and service layer configuration look like?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants