Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible Deadlock in BehaviorSubject.Subscribe #2080

Open
Joost-Jens-Luminis opened this issue Feb 7, 2024 · 2 comments
Open

Possible Deadlock in BehaviorSubject.Subscribe #2080

Joost-Jens-Luminis opened this issue Feb 7, 2024 · 2 comments

Comments

@Joost-Jens-Luminis
Copy link

Joost-Jens-Luminis commented Feb 7, 2024

Hello and thank you for using dotnet/reactive. Please select a category and detail your issue by answering the questions there:

Bug

We recently had a deadlock in our application, when while the Subscribe was active, the UnSubscribe (via the Dispose) was called on a differen thread and this caused a deadlock. This happened in our case as the subscription was cancelled before the Subscribe was even called (due to a quickly opening and closing of a screen).

We found the issue to be in the following file: https://github.com/dotnet/reactive/blob/main/Rx.NET/Source/src/System.Reactive/Subjects/BehaviorSubject.cs

I suggest to change the following code from

    public override IDisposable Subscribe(IObserver<T> observer)
        {
            if (observer == null)
            {
                throw new ArgumentNullException(nameof(observer));
            }

            Exception? ex;

            lock (_gate)
            {
                CheckDisposed();

                if (!_isStopped)
                {
                    _observers = _observers.Add(observer);
                    observer.OnNext(_value);
                    return new Subscription(this, observer);
                }

                ex = _exception;
            }

            if (ex != null)
            {
                observer.OnError(ex);
            }
            else
            {
                observer.OnCompleted();
            }

            return Disposable.Empty;
        }

To:

    public override IDisposable Subscribe(IObserver<T> observer)
    {
        if (observer == null)
        {
            throw new ArgumentNullException(nameof(observer));
        }

        Exception? ex;

        lock (this._gate)
        {
            this.CheckDisposed();

            this._observers = this._observers.Add(observer);
            ex = this._exception;
        }

        // Note: This first IF statement with the observer.OnNext was originally inside the lock
        // This could cause a deadlock in certain circumstances. As part of the fix we have made a copy of the 
        // BehaviorSubject from Reactive.Net and moved this outside the lock to prevent a deadlock
        // when a subject is being disposed while being inside the Subscribe.
        // See also: https://dev.azure.com/bronkhorst/IT/_wiki/wikis/Software-development.wiki/627/043-deadlock-terug-navigeren-Android
        if (!this._isStopped)
        {
            observer.OnNext(this._value);
            return new Subscription(this, observer);
        }
        else if (ex != null)
        {
            observer.OnError(ex);
        }
        else
        {
            observer.OnCompleted();
        }

        return Disposable.Empty;
    }

The main change is that the IObserver.OnNext call is outside the lock. This also matches what happens when BehaviourSubject.OnNext from the is called, where the IObserver.OnNext is also outside the lock. This will prevent future deadlock scenarios.

Which library version?
6.0.0
What are the platform(s), environment(s) and related component version(s)?
Android/IOS/Windows (probably all platforms, but verified on these)

What is the use case or problem?
A deadlock can occur.

What is the expected outcome?
No deadlock possibility.

What is the actual outcome?
In our case a deadlock.

What is the stacktrace of the exception(s) if any?
There is no exception.

Do you have a code snippet or project that reproduces the problem?

@akarnokd
Copy link
Collaborator

akarnokd commented Feb 7, 2024

BehaviorSubject is very special in that it promises to emit the stored value to the incoming observer before it relays future values. The trouble is, a concurrent BehaviorSubject::OnNext call can happen while a BehaviourSubject::Subscribe is ongoing. OnNext mustn't be called concurrently so the two method calls must be serialized, hence the emission under the lock.

This happened in our case as the subscription was cancelled before the Subscribe was even called

The IDisposable returned by the BehaviorSubject::Subscribe can't be disposed upfront as it does not exist until the method returns it. There must have been something else causing the deadlock. Could you please share more details about the relevant code of yours?

@Joost-Jens-Luminis
Copy link
Author

Thank you for your quick reply, it's much appreciated.

Thinking about it, I agree that the Dispose couldn't cause the deadlock, except for when a subscribe causes a dispose of a previous subscription (but why should it?). I must have been confused as several dispose calls were waiting to get the lock and in the stacktrace that was always as a result of an Subscribe call. This was on Android though which may have caused issues in the stacktrace.

To be honest, there were a bunch of issues happening at the same time divided over several threads (up to 15 according to the parallel stacks window) which caused problems, making it difficult to separate which calls exactly caused the deadlock. (It took me a week and half to figure it all out)
I do know that the dispose was involved and making this change solved the deadlock for us. :)

As for what emits when; if the promise that the stored value is first emitted before any OnNext calls is sacred, then I agree my suggested change can't be made, even though it is unlikely an incoming OnNext call is emiting before the Subscribe completes. But if you believe it can happen (and I do think it can in theory), then by extension, there is no guarantee that all OnNext calls are emitted in the order they are received, which sort of makes the promise a bit of a weird one to make (what is the point of it).

I just wanted to put my issue and solution out here, as for us it works (and it may help other people). I do think it's clear a deadlock "could" happen there. But if my solution doesn't fit in the context of what the BehaviorSubject promises (or should promise) then feel free to close this issue.

Once again, I appreciate your quick response. Thanks again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants