Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make libprotobuf symbols local on OSX #8346

Merged
merged 1 commit into from Apr 24, 2021

Conversation

rowillia
Copy link
Contributor

@gnossen gave a great overview in grpc/grpc#24992 of the overall problem.

If a python process using both protobuf and another native library linking in libprotobuf
frequently can cause crashes. This seems to frequently affect tensorflow as well:

tensorflow/tensorflow#8394,
tensorflow/tensorflow#9525 (comment)
tensorflow/tensorflow#24976,
tensorflow/tensorflow#35573,
https://github.com/tensorflow/tensorflow/blob/v2.0.0/tensorflow/contrib/makefile/rename_protobuf.sh,
tensorflow/tensorflow#16104

This fixes both crashes when linking in multiple versions of protobuf
and fixes DescriptorPool clashes as well (e.g. Python and Native code import different versions of the same message).

@gnossen gave a great overview in grpc/grpc#24992 of the overall problem.

If a python process using both protobuf _and_ another native library linking in libprotobuf
frequently can cause crashes.  This seems to frequently affect tensorflow as well:

tensorflow/tensorflow#8394,
tensorflow/tensorflow#9525 (comment)
tensorflow/tensorflow#24976,
tensorflow/tensorflow#35573,
https://github.com/tensorflow/tensorflow/blob/v2.0.0/tensorflow/contrib/makefile/rename_protobuf.sh,
tensorflow/tensorflow#16104

Testing locally this fixes both crashes when linking in multiple versions of protobuf
and fixes `DescriptorPool` clashes as well (e.g. Python and Native code import different versions of the same message).
@@ -192,6 +192,18 @@ def get_option_from_sys_argv(option_str):

extra_compile_args = []

message_extra_link_args = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/grpc/grpc/pull/24992/files
Could you add a comment like the one in grpc's PR to explain the reason?

@TeBoring
Copy link
Contributor

  1. Could you explain a little more about, before this change, how does mixture of different versions of symbols caused the problem? Seems to me, previously, each versions protobuf symbol was local weak symbols, which is just used in the module locally. If so, seems nothing bad would happen.
  2. Seems your fix makes the local weak symbol defined by protobuf public. How does that fix the problem?
  3. what will happen if another library links protobuf with this fix? And what will happen if another library links protobuf without this fix?

@TeBoring TeBoring self-assigned this Apr 23, 2021
@TeBoring
Copy link
Contributor

TeBoring commented Apr 24, 2021

Reply to my own questions:

  1. https://anadoxin.org/blog/control-over-symbol-exports-in-gcc.html/
    This gives a pretty good explanation. By default, all symbols are exposed. If both B1 and B2 statically link A and expose A's
    symbols, on the other hand, C dynamically links A will get different versions of symbols from B1 and B2, which mess things up.
  2. This fix restrict exported symbol from all to those specified.
  3. Other libraries need to do the same. Otherwise, those other libraries still create conflicts.

@TeBoring TeBoring merged commit 7ad0fd4 into protocolbuffers:master Apr 24, 2021
goaway pushed a commit to envoyproxy/envoy-mobile that referenced this pull request Jun 8, 2021
Description: This PR modifies the build recipe for `envoy_engine.so` to hide all symbols except for `_PyInit_envoy_engine` by default on maOS, so that envoy_engine can be loaded alongside a [protobuf](http://pypi.org/project/protobuf) wheel without breaking.

Loading in protobuf alongside `envoy_engine.so` would register two sets of the same symbols and, on macOS, the runtime linker chooses the first symbol it finds causing problems. See the [similar protobuf change](protocolbuffers/protobuf#8346) and its [sister change in grpc](grpc/grpc#24992) for more information on what's going on.

Risk Level: Low
Testing: See the now-closed #1504 for how I've been testing this.
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Cerek Hillen <chillen@lyft.com>
@peteboothroyd
Copy link

@TeBoring what happened to these changes? I can't see them in the current branch, nor in the git history for the setup.py file.

@peteboothroyd
Copy link

@TeBoring @rowillia @jtattermusch these changes were removed here, and then some of the changes to setup.py were reverted here. However the changes in this PR were missed. I believe that this is probably still an ongoing issue with the protobuf package, unless some other change has obviated the need for this?

acozzette pushed a commit to acozzette/protobuf that referenced this pull request Jan 22, 2022
@gnossen gave a great overview in grpc/grpc#24992 of the overall problem.

If a python process using both protobuf _and_ another native library linking in libprotobuf
frequently can cause crashes.  This seems to frequently affect tensorflow as well:

tensorflow/tensorflow#8394,
tensorflow/tensorflow#9525 (comment)
tensorflow/tensorflow#24976,
tensorflow/tensorflow#35573,
https://github.com/tensorflow/tensorflow/blob/v2.0.0/tensorflow/contrib/makefile/rename_protobuf.sh,
tensorflow/tensorflow#16104

Testing locally this fixes both crashes when linking in multiple versions of protobuf
and fixes `DescriptorPool` clashes as well (e.g. Python and Native code import different versions of the same message).
acozzette added a commit that referenced this pull request Jan 25, 2022
@gnossen gave a great overview in grpc/grpc#24992 of the overall problem.

If a python process using both protobuf _and_ another native library linking in libprotobuf
frequently can cause crashes.  This seems to frequently affect tensorflow as well:

tensorflow/tensorflow#8394,
tensorflow/tensorflow#9525 (comment)
tensorflow/tensorflow#24976,
tensorflow/tensorflow#35573,
https://github.com/tensorflow/tensorflow/blob/v2.0.0/tensorflow/contrib/makefile/rename_protobuf.sh,
tensorflow/tensorflow#16104

Testing locally this fixes both crashes when linking in multiple versions of protobuf
and fixes `DescriptorPool` clashes as well (e.g. Python and Native code import different versions of the same message).

Co-authored-by: Roy Williams <roy.williams.iii@gmail.com>
jpsim pushed a commit to envoyproxy/envoy that referenced this pull request Nov 28, 2022
Description: This PR modifies the build recipe for `envoy_engine.so` to hide all symbols except for `_PyInit_envoy_engine` by default on maOS, so that envoy_engine can be loaded alongside a [protobuf](http://pypi.org/project/protobuf) wheel without breaking.

Loading in protobuf alongside `envoy_engine.so` would register two sets of the same symbols and, on macOS, the runtime linker chooses the first symbol it finds causing problems. See the [similar protobuf change](protocolbuffers/protobuf#8346) and its [sister change in grpc](grpc/grpc#24992) for more information on what's going on.

Risk Level: Low
Testing: See the now-closed #1504 for how I've been testing this.
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Cerek Hillen <chillen@lyft.com>
Signed-off-by: JP Simard <jp@jpsim.com>
jpsim pushed a commit to envoyproxy/envoy that referenced this pull request Nov 29, 2022
Description: This PR modifies the build recipe for `envoy_engine.so` to hide all symbols except for `_PyInit_envoy_engine` by default on maOS, so that envoy_engine can be loaded alongside a [protobuf](http://pypi.org/project/protobuf) wheel without breaking.

Loading in protobuf alongside `envoy_engine.so` would register two sets of the same symbols and, on macOS, the runtime linker chooses the first symbol it finds causing problems. See the [similar protobuf change](protocolbuffers/protobuf#8346) and its [sister change in grpc](grpc/grpc#24992) for more information on what's going on.

Risk Level: Low
Testing: See the now-closed #1504 for how I've been testing this.
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Cerek Hillen <chillen@lyft.com>
Signed-off-by: JP Simard <jp@jpsim.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants