Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: fix missed register_callsite error #2938

Draft
wants to merge 1 commit into
base: v0.1.x
Choose a base branch
from

Commits on Apr 16, 2024

  1. core: fix missed register_callsite error

    There are 2 triggers which will cause a subscriber to receive a call to
    `Subscriber::register_callsite` for a specific callsite.
    1. The first time the event or span at that callsite is executed.
    2. When a new subscriber is added or removed (for example, calls to
       `set_default` or `with_default`)
    
    It is trigger (2) that will cause a new subscriber to receive
    `Subscriber::register_callsite` for all the callsites which had already
    been registered before it became active.
    
    When a callsite is registered for trigger (1), the callsite starts in
    state `UNREGISTERED`.
    
    The first thread to encounter the callsite will transition it to
    `REGISTERING` and determine the overall interest for the callsite by
    registering with all known dispatchers (which will call into
    `Subscriber::register_callsite`).
    
    Once that is complete, the callsite is added to the list of all known
    callsites and its state is transitioned to `REGISTERED`.
    
    is (re)built for all known dispatchers. The callsite starts in state
    `UNREGISTERED`. The  This calls down into
    `Subscriber::register_callsite` for each subscriber. Once that is
    complete, the callsite is added to the global list of known callsites.
    
    While the callsite interest is being rebuilt, other threads that
    encounter the callsite will be given `Interest::sometimes()` until the
    registration is complete. However, if a new subscriber is added during
    this window, all the interest for all callsites will be rebuilt, but
    because the new callsite (in state `REGISTERING`) won't be included
    because it isn't yet in the global list of callsites.
    
    This can cause a case where that new subscriber being added won't
    receive `Subscriber::register_callsite` before it receives the subsequent
    call to `Subscriber::event` or `Subscriber::new_span`.
    
    The documentation on [Registering Callsites] is not very explicit on
    this point, but it does suggest that `Subscriber::register_callsite`
    will be called before the call to either `Subscriber::event` or
    `Subscriber::new_span`, and the current behavior can break this implicit
    contract.
    
    [Registering Callsites]: https://docs.rs/tracing-core/0.1.32/tracing_core/callsite/index.html#registering-callsites
    
    This change swaps the order of rebuilding the callsite interest and
    adding the callsite to the global list so that the callsite gets pushed
    first, avoiding this window in which a subscriber won't get a call to
    `register_callsite`.
    
    As such, a callsite may have its interest read before it is set. In this
    case, the existing implementation will return `Interest::sometimes()`
    for the `DefaultCallsite` implementation. Other implementations (outside
    of the `tracing` project) may perform this differently, but in this
    case, there is no documented guarantee regarding the ordering.
    
    A regression test is included which provokes the race condition 100% of
    the time before the changes in this fix.
    
    Fixes: #2743
    hds committed Apr 16, 2024
    Configuration menu
    Copy the full SHA
    51d7d9e View commit details
    Browse the repository at this point in the history