Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deps: remove thread_local dependency #749

Merged
merged 4 commits into from
Mar 12, 2021
Merged

deps: remove thread_local dependency #749

merged 4 commits into from
Mar 12, 2021

Conversation

BurntSushi
Copy link
Member

Explanation is in the commit log, and more details are in src/pool.rs in this patch. The short summary is that it seems like thread_local is the underlying cause of leaks in some usage scenarios, so we replace it with a simpler but more constrained memory pool. We also take this opportunity to iron out our UnwindSafe and RefUnwindSafe API commitments. That is, all the various Regex types now impl both traits and this is now tracked in tests.

Fixes #362, Fixes #576

Ref BurntSushi/rure-go#3

The quickcheck update seems to have sussed out a bug in our DFA logic
regarding the encoding of NFA state IDs. But the bug seems unlikely to
occur in real code, so we massage the test data for now until the lazy
DFA gets moved into regex-automata.
It looks like it blows the default regex size limit at the moment.
This removes extraneous commentary that uses the word 'unsafe'. This
makes it easier to grep for usages of meaningful 'unsafe' in the code.
This commit removes the thread_local dependency (even as an optional
dependency) and replaces it with a more purpose driven memory pool. The
comments in src/pool.rs explain this in more detail, but the short story
is that thread_local seems to be at the root of some memory leaks
happening in certain usage scenarios.

The great thing about thread_local though is how fast it is. Using a
simple Mutex<Vec<T>> is easily at least twice as slow. We work around
that a bit by coding a simplistic fast path for the "owner" of a pool.
This does require one new use of `unsafe`, of which we extensively
document.

This now makes the 'perf-cache' feature a no-op. We of course retain it
for compatibility purposes (and perhaps it will be used again in the
future), but for now, we always use the same pool.

As for benchmarks, it is likely that *some* cases will get a hair
slower. But there shouldn't be any dramatic difference. A careful review
of micro-benchmarks in addition to more holistic (albeit ad hoc)
benchmarks via ripgrep seems to confirm this.

Now that we have more explicit control over the memory pool, we also
clean stuff up with repsect to RefUnwindSafe.

Fixes #362, Fixes #576

Ref BurntSushi/rure-go#3
@BurntSushi BurntSushi merged commit e040c1b into master Mar 12, 2021
@BurntSushi BurntSushi deleted the ag/updates branch March 12, 2021 02:10
@BurntSushi
Copy link
Member Author

This PR is on crates.io in regex 1.4.4.

This was referenced Mar 12, 2021
This was referenced Mar 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RegexSet references cannot be passed across unwind boundaries thread local cache may cause memory leaks
1 participant