stackoverflow in 1.4.4 #750

Closed

Description

@BusyJay

What version of regex are you using?

If it isn't the latest version, then please upgrade and check whether the bug
is still present.

Describe the bug at a high level.

regex 1.4.4 breaks the Windows build of grpcio.

What are the steps to reproduce the behavior?

Just run cargo test --all on Windows.

What is the actual behavior?

Build fails when generating bindings using rust-bindgen, which uses regex to process sources.

After downgrading regex to a commit before e040c1b, the build finishes successfully, so the bug was probably introduced by #749.

What is the expected behavior?

It compiles successfully.

What do you expect the output to be?

Activity

BurntSushi (Member) commented on Mar 13, 2021

I need a smaller reproduction please. There is no obvious reason that I see why any recent changes would cause a stack overflow. It also isn't clear whether you're reporting a compilation error or something that happens at regex runtime. If the former, that then sounds like a rustc bug, no? If the latter, a stack trace would be helpful.

BusyJay (Author) commented on Mar 13, 2021

I'm not familiar enough with the Windows platform to provide a stack trace, sorry. It's a compilation error in the sense that bindgen fails to generate bindings, and it fails because of a stack overflow. rust-bindgen is the only dependency of grpcio-sys that uses regex. I guess some code in rust-bindgen uses regex heavily, which results in the stack overflow.

I can confirm that wrapping T in a Box in the following field works around the issue, although I'm not sure it's the correct fix.

owner_val: T,

Before the PR, all values were allocated on the heap when perf-cache was not enabled.
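
To illustrate why boxing the field can help, here is a minimal sketch. The types `BigCache`, `InlinePool`, and `BoxedPool` are hypothetical stand-ins and not the actual regex internals; the only point is that a `Box<T>` field moves the payload to the heap and leaves a single pointer inside the struct.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a large cache value (not regex's real type).
#[allow(dead_code)]
struct BigCache {
    scratch: [u8; 800],
}

/// Stores the value inline, so the struct is at least as large as `T`.
#[allow(dead_code)]
struct InlinePool<T> {
    owner: usize,
    owner_val: T,
}

/// Stores the value behind a heap pointer, so the struct stays small.
#[allow(dead_code)]
struct BoxedPool<T> {
    owner: usize,
    owner_val: Box<T>,
}

/// Size in bytes of the inline variant.
fn inline_size() -> usize {
    size_of::<InlinePool<BigCache>>()
}

/// Size in bytes of the boxed variant.
fn boxed_size() -> usize {
    size_of::<BoxedPool<BigCache>>()
}

fn main() {
    println!("inline: {} bytes", inline_size());
    println!("boxed:  {} bytes", boxed_size());
}
```

With the boxed variant, every value that embeds the pool on the stack shrinks by roughly the size of `T`, at the cost of one heap allocation and one pointer indirection.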

BurntSushi (Member) commented on Mar 13, 2021

I'm not familiar enough with Windows either.

Your patch that fixes the issue is quite weird. I wonder if this is not a case of recursion causing a stack overflow, but rather, too many things on the stack itself. If Windows has smaller stack sizes than, let's say, Linux or macOS, it could explain why Windows specifically is having a problem. But for this to be true, I think you would need to put a lot of regexes on the stack.

BusyJay (Author) commented on Mar 13, 2021

According to the rustc output, the new Pool size is 848 bytes. Is that the expected size? The default stack size of a thread in Rust is 2MiB. Supposing half of the stack is used by other stack frames, users can keep no more than about 1236 regexes on the stack (2473 would fill the entire 2MiB).

% env RUSTFLAGS="-Z print-type-sizes=y" cargo +nightly test --all | grep regex::pool::Pool
...
print-type-size type: `regex::pool::Pool<std::panic::AssertUnwindSafe<std::cell::RefCell<regex::exec::ProgramCacheInner>>>`: 848 bytes, alignment: 8 bytes
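
As a quick check of the stack budget arithmetic, the sketch below divides a stack budget by the reported 848-byte size. The helper `max_items` is a hypothetical name introduced for illustration, and the calculation ignores every other stack frame, so it is an upper bound rather than a precise limit.

```rust
/// How many values of `item_size` bytes fit into `budget` bytes.
fn max_items(budget: usize, item_size: usize) -> usize {
    budget / item_size
}

fn main() {
    const MIB: usize = 1024 * 1024;
    // Size taken from the print-type-sizes output above.
    const POOL_SIZE: usize = 848;

    // The whole 2 MiB default stack of a spawned Rust thread.
    println!("2 MiB: {}", max_items(2 * MIB, POOL_SIZE)); // prints "2 MiB: 2473"
    // Half of that stack, or the whole of a 1 MiB stack.
    println!("1 MiB: {}", max_items(MIB, POOL_SIZE)); // prints "1 MiB: 1236"
}
```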

...why Windows specifically is having a problem

I think this is a common problem on all platforms. The reason it always overflows the stack on Windows may be that the symbols differ between platforms. Perhaps rust-bindgen just needs to handle more symbols there than on other platforms.

BusyJay (Author) commented on Mar 13, 2021

Windows has smaller stack sizes

Actually, that's true. I got the 2MiB figure from https://doc.rust-lang.org/std/thread/index.html#stack-size, but I missed the last line:

Note that the stack size of the main thread is not determined by Rust.

After referring to the docs and writing small snippets to verify it, I found that the main thread's stack size on macOS and Linux matches ulimit -s, which is usually 8MiB, while on Windows it is 1MiB.
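
The verification snippets themselves are not shown in the thread. As an illustrative sketch of how stack size limits the number of large values in one frame, the code below runs a function on a thread whose stack size is chosen explicitly with `std::thread::Builder::stack_size`. The names `Fat`, `fill_frame`, and `run_with_stack` are hypothetical, and the 848-byte payload only mirrors the size reported earlier.

```rust
use std::hint::black_box;
use std::thread;

const MIB: usize = 1024 * 1024;

/// Hypothetical stand-in for a large value kept on the stack.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Fat([u8; 848]);

/// Places `N` fat values in one stack frame and returns the bytes they occupy.
fn fill_frame<const N: usize>() -> usize {
    let frame = [Fat([0u8; 848]); N];
    // Keep the optimizer from discarding the array.
    black_box(&frame);
    std::mem::size_of_val(&frame)
}

/// Runs `fill_frame` on a thread with an explicitly chosen stack size.
fn run_with_stack<const N: usize>(stack_bytes: usize) -> usize {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(fill_frame::<N>)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

fn main() {
    // About 1.6 MiB of values fits comfortably in an 8 MiB stack.
    let used = run_with_stack::<2000>(8 * MIB);
    println!("frame used {} bytes", used);
}
```

Passing `MIB` instead of `8 * MIB` would be expected to overflow the stack and abort the process, since the frame alone needs more than 1 MiB; that call is deliberately left out so the example runs to completion.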

BurntSushi (Member) commented on Mar 14, 2021

@jdm One thing that would be useful is a pointer to the source code where bindgen uses regexes. If there are a lot of them on the stack, then I think that would explain things here.

BurntSushi (Member) commented on Mar 14, 2021

servo/servo#28269 is the tracking bug in servo for this, as they are hitting it too.

BurntSushi (Member) commented on Mar 14, 2021

I'm working on a patch for this now.

added a commit that references this issue on Mar 14, 2021
4c9ba9d

jdm commented on Mar 14, 2021

added a commit that references this issue on Mar 14, 2021
081d430

BurntSushi (Member) commented on Mar 14, 2021

@jdm OMG. That's so many regexes! Hahahaha. That has to be it.

I opened #752 that shrinks the size of Regex from 856 bytes to 16. Lol. It actually used to be 552 bytes before 1.4.4, so it was never particularly small. So it sounds like it must have crossed a stack size threshold somewhere.


        stackoverflow in 1.4.4 · Issue #750 · rust-lang/regex