Provide examples for async use of `Hub` #507

peterstuart · 2022-10-13T21:51:32Z

I recently had an issue (events/spans being recorded in the wrong places) which was caused by me not understanding that I needed to be manually cloning/creating Hubs and using bind_hub on futures which I was joining/spawning/etc. (See support request 72484 if you're interested). The docs do explain that, but:

I naively assumed that sentry-tracing would deal with that for me, since it already has the concepts of spans which work with futures
the docs are pretty light on details about exactly what I needed to do

I think it'd be helpful to provide an example in the repo of manually using bind_hub, and I think it might be worth linking to something about working with async code more prominently (the front page of the docs)?

I'm also not totally clear on exactly how to manually deal with Hubs. Should I be using Hub::new_from_top vs Hub::new, or just reusing Hub::current? (I don't understand what "the top scope of another hub" means in the docs for new_from_top.) Hub::new_from_top(Hub::current()) seems to be working the way I'd expect so far. 🤷🏻

Anyway, thanks for the great SDK. I'm very happy with how seamless it's been to use (especially with the tower and tracing integrations). Just thought I'd report the troubles I ran into.

The text was updated successfully, but these errors were encountered:

Swatinem · 2022-10-17T14:06:09Z

Hi! Yes the support request came through me as well.

I understand this is a confusing topic, as the ecosystem itself exposes some non-obvious footguns around this concept.

The upstream tracing docs also have a related warning here: https://docs.rs/tracing/latest/tracing/span/struct.Span.html#in-asynchronous-code
I also blogged about some related issues here: https://swatinem.de/blog/non-lazy-futures/ and https://swatinem.de/blog/log-contexts/

Either way, there are two general patterns here:

If you spawn tasks, or use futures::join concurrency, you have to always create a new hub (new_from_top, aka "inherit" the current scope properties and breadcrumbs).

As those tasks can run concurrently (or in parallel on multiple threads), they need their own copy of the Hub/Scope so they don’t interfere with each other.

The only exception from this rule is if you immediate await a single spawned task. In that case it is okay (and likely desired) that the spawned task will reuse the same scope and breadcrumbs as the "caller", as there is little difference from spawn/await and a normal function call.
If the spawned task outlives its "caller", it should get its own Hub however.

As these are indeed footguns and hard concepts to teach, I would appreciate feedback on how we could improve the docs in this regard. From your perspective, is the explanation above sufficient? Where should these things be explained? The main Hub docs? Maybe a specific section somewhere that talks about async code? This is also not specific to async, thread::spawn concurrency has the same problem, its just a lot more obvious when dealing with async code.

peterstuart · 2022-10-17T19:19:52Z

If you spawn tasks, or use futures::join concurrency, you have to always create a new hub (new_from_top, aka "inherit" the current scope properties and breadcrumbs).

One other time I found I needed to create a new hub was when processing a stream in parallel:

let hub = Hub::current();

let x: Vec<_> = stream::iter(inputs)
    .map(|input| {
        async move { Self::do_something(input, pool).await }
            .bind_hub(Hub::new_from_top(&hub))
    })
    .buffered(10)
    .try_collect()
    .await?;

The only exception from this rule is if you immediate await a single spawned task. In that case it is okay (and likely desired) that the spawned task will reuse the same scope and breadcrumbs as the "caller", as there is little difference from spawn/await and a normal function call.

Ah, interesting—I didn't realize that awaiting a spawned task would run it within the context of the current future.

As these are indeed footguns and hard concepts to teach, I would appreciate feedback on how we could improve the docs in this regard. From your perspective, is the explanation above sufficient? Where should these things be explained? The main Hub docs? Maybe a specific section somewhere that talks about async code? This is also not specific to async, thread::spawn concurrency has the same problem, its just a lot more obvious when dealing with async code.

I think the explanation above is great. Adding it to the Hub docs makes sense to me, although I think additionally pointing to that from somewhere more prominent (front page of the docs?) would be helpful too. "If you're writing async or threaded code, see the Hub documentation"? Maybe I should have read the docs more thoroughly to start with, but it took me a long time to figure out that the Hub docs were where I should be looking, because I hadn't actually used Hubs at all manually, since the tracing and tower integrations had been sufficient for everything else I needed to do.

Thanks again!

Swatinem · 2022-10-18T07:50:50Z

Ah, interesting—I didn't realize that awaiting a spawned task would run it within the context of the current future.

Actually, I said the wrong thing here :-D You do have to bind a hub, but you can bind the current hub, instead of a new one.
Hubs are refcounted and have internal mutability anyway, so you can bind the same copy to the spawned future. It does not need to be a new one.

I will try to add some more docs to the hub overview and main page.

Swatinem · 2022-10-18T07:52:21Z

I needed to create a new hub was when processing a stream in parallel:

Well yes, because parallelism / concurrency means the tasks are running independently.

Swatinem · 2022-10-18T14:11:46Z

#509 is a try at improving the docs around this with some examples.
Also while creating the examples, I noticed (yet again) that our Hub APIs are extremely cumbersome to use.
I will try to get more clarity around these things internally and see if we can improve things.

gautamg795 · 2023-02-24T22:20:07Z

Hi @Swatinem, sorry to "hijack" this thread but I think I'm still using bind_hub incorrectly, as I'm seeing unexpected results in the error events on Sentry when a task spawned from an HTTP request fails.

Specifically we encounter this with websocket handlers in Axum — the server itself follows the example to use the sentry-tower layers (e.g. svc.layer(sentry_tower::NewSentryLayer::<_>::new_from_top()).layer(sentry_tower::SentryHttpLayer::new()))

In the websocket handling method, I try to create a new hub and bind it to the future that handles the websocket upgrade:

async fn ws_handler(
    ws: axum::extract::ws::WebSocketUpgrade,
    ...
) -> Result<impl IntoResponse, MyError> {
    let hub = sentry::Hub::new_from_top(sentry::Hub::current());
    Ok(ws.on_upgrade(|ws| handle_websocket(ws, ...).bind_hub(hub)))
}

The first time something errors inside handle_websocket, everything looks great — the request path and headers show up on Sentry as expected. However, errors in subsequent requests have the same metadata — same request path, headers, etc — implying the scope isn't getting properly popped, or the hub is being nested incorrectly (I don't totally understand).

As far as I can tell, this follows the examples you've provided above and in the docs, so any help would be appreciated!

Swatinem · 2023-02-28T11:30:35Z

errors in subsequent requests have the same metadata

I assume that is for every new request that goes through the websocket upgrade?
I haven’t had a chance to try websockets with axum yet though, so no idea how that works in detail. The code you presented does look correct.
How does the rest of your Router and service layer configuration look like?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Provide examples for async use of `Hub` #507

Provide examples for async use of `Hub` #507

peterstuart commented Oct 13, 2022

Swatinem commented Oct 17, 2022

peterstuart commented Oct 17, 2022

Swatinem commented Oct 18, 2022

Swatinem commented Oct 18, 2022

Swatinem commented Oct 18, 2022

gautamg795 commented Feb 24, 2023

Swatinem commented Feb 28, 2023

Provide examples for async use of Hub #507

Provide examples for async use of Hub #507

Comments

peterstuart commented Oct 13, 2022

Swatinem commented Oct 17, 2022

peterstuart commented Oct 17, 2022

Swatinem commented Oct 18, 2022

Swatinem commented Oct 18, 2022

Swatinem commented Oct 18, 2022

gautamg795 commented Feb 24, 2023

Swatinem commented Feb 28, 2023

Provide examples for async use of `Hub` #507

Provide examples for async use of `Hub` #507