Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lock around strong cache updates #973

Merged
merged 1 commit into from Nov 3, 2019

Conversation

pganssle
Copy link
Member

@pganssle pganssle commented Nov 3, 2019

Summary of changes

Since the strong cache uses operations in OrderedDict that are not thread-safe in Python 2, it is necessary to acquire a lock while updating the dictionary.

When Python 2 support is dropped, we can likely refactor this to avoid locking.

I have a small script that reproduces the issue on Python 2.7, but unfortunately it only reproduces the issue intermittently, and in the current implementation seems to be prone to deadlocks (possibly that is another issue with the tz code..), so it is not really an ideal unit test.

Still, I'll include it here in case anyone wants to verify that this solution works:

from dateutil import tz

import threading
try:
    from threading import Barrier
except ImportError:
    from threading import Semaphore

    class Barrier(object):
        def __init__(self, n):
            self.n = n
            self.count = 0
            self.mutex = Semaphore(1)
            self.barrier = Semaphore(0)

        def wait(self):
            self.mutex.acquire()
            self.count = self.count + 1
            self.mutex.release()
            if self.count == self.n:
                self.barrier.release()
            self.barrier.acquire()
            self.barrier.release()

N = 2500
b = Barrier(N)
b2 = Barrier(N-1)

def constructor():
    b.wait()
    tz.tzstr("EST5EDT")
    b2.wait()

for i in range(N):
    threading.Thread(target=constructor).start()

b2.wait()
print("Done")

Closes #901.

Pull Request Checklist

@pganssle pganssle added this to the 2.8.1 milestone Nov 3, 2019
Since the strong cache uses operations in OrderedDict that are not
thread-safe in Python 2, it is necessary to acquire a lock while
updating the dictionary.

When Python 2 support is dropped, we can likely refactor this to avoid
locking.
@pganssle pganssle merged commit 357c62c into dateutil:master Nov 3, 2019
@jbrockmendel
Copy link
Contributor

How confident are you that this is only needed in py2? it wouldnt be difficult to make the lock py2-only.

@pganssle
Copy link
Member Author

pganssle commented Nov 3, 2019

@jbrockmendel Not terribly confident. It seems like it's from this closed bug in CPython, but I'm not sure if PyPy3 is still affected by it (if they ever were). This is also fixing another race condition that is definitely there in both versions, but only causes unnecessary cache misses as opposed to throwing errors.

If we had a reasonable test for this, I'd probably just change cls.__cache_lock to be contextmanager.nullcontext on Python 3 or something similar, but I don't think it hurts much to lock here, and if we find that it's a terrible performance burden we can cut a new release that is more clever about avoiding locks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Intermittent parse() KeyError failure with 2.8.0
2 participants