gRPC client crashes with a segmentation fault in grpc::Channel::~Channel #14248

Closed
ghost opened this issue Jan 31, 2018 · 2 comments

ghost commented Jan 31, 2018

Please answer these questions before submitting your issue.

Should this be an issue in the gRPC issue tracker?

yes

What version of gRPC and what language are you using?

C++, with TensorFlow v1.5

What operating system (Linux, Windows, …) and version?

Linux

What runtime / compiler are you using (e.g. python version or version of gcc)

gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)

What did you do?

If possible, provide a recipe for reproducing the error. Try being specific and include code snippets if helpful.
Use gRPC as a serving client.

What did you expect to see?

The process should not crash (no core dump).

What did you see instead?

The process crashes and writes a core dump file.

Make sure you include information that can help us debug (full error message, exception listing, stack trace, logs).
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./bin/rf-dl-server conf/rf-dl-server.conf'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f9f73f1366e in grpc_client_channel_stop_backup_polling(grpc_exec_ctx*, grpc_pollset_set*) () from /data/panjize/rf-dl-od_time_start2_limit_len_r15/lib/libinterface-15.so
Missing separate debuginfos, use: debuginfo-install glibc-2.17-157.el7_3.5.x86_64 libgcc-4.8.5-11.el7.x86_64 openssl-libs-1.0.1e-60.el7_3.1.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00007f9f73f1366e in grpc_client_channel_stop_backup_polling(grpc_exec_ctx*, grpc_pollset_set*) () from /data/panjize/rf-dl-od_time_start2_limit_len_r15/lib/libinterface-15.so
#1 0x00007f9f73f168f4 in cc_destroy_channel_elem(grpc_exec_ctx*, grpc_channel_element*) () from /data/panjize/rf-dl-od_time_start2_limit_len_r15/lib/libinterface-15.so
#2 0x00007f9f73f1f614 in grpc_channel_stack_destroy () from /data/panjize/rf-dl-od_time_start2_limit_len_r15/lib/libinterface-15.so
#3 0x00007f9f73f4d3dc in destroy_channel(grpc_exec_ctx*, void*, grpc_error*) () from /data/panjize/rf-dl-od_time_start2_limit_len_r15/lib/libinterface-15.so
#4 0x00007f9f73f34167 in grpc_exec_ctx_flush () from /data/panjize/rf-dl-od_time_start2_limit_len_r15/lib/libinterface-15.so
#5 0x00007f9f73f4e889 in grpc_channel_destroy () from /data/panjize/rf-dl-od_time_start2_limit_len_r15/lib/libinterface-15.so
#6 0x00007f9f73ecbf07 in grpc::Channel::~Channel() () from /data/panjize/rf-dl-od_time_start2_limit_len_r15/lib/libinterface-15.so
#7 0x00007f9f73ecc051 in grpc::Channel::~Channel() () from /data/panjize/rf-dl-od_time_start2_limit_len_r15/lib/libinterface-15.so
#8 0x00007f9f73eb232a in TFSFreeDataClient(TFSDataClient*) () from /data/panjize/rf-dl-od_time_start2_limit_len_r15/lib/libinterface-15.so
#9 0x0000000000502a34 in its::rh::FeatureDL::CalcRouteHotBatch (this=this@entry=0x7f9f61e768e8, city_id=, req_data=..., result=..., mask=..., time_serving=@0x7f9f61e776a4: 14936,
padding_max_primitive=@0x7f9f61e776a8: 62, padding_max_process=@0x7f9f61e776ac: 62) at /home/xiaoju/user/panjize/auto-build/data/modules/route-feature-dl/src/core/rh_feature_dl.cpp:95
#10 0x00000000004ed10c in its::rh::ServiceHandler::Calc_RouteHotBatch (this=this@entry=0x2c7ab80, feature_name=..., req_data=..., adaptor_struct=..., resp_data=..., error_msg=..., map_version=...,
time_serving=@0x7f9f61e776a4: 14936, padding_max_primitive=@0x7f9f61e776a8: 62, padding_max_process=@0x7f9f61e776ac: 62)
at /home/xiaoju/user/panjize/auto-build/data/modules/route-feature-dl/src/service/rh_handler.cpp:275
#11 0x00000000004ee3f7 in its::rh::ServiceHandler::BatchRouteHot (this=this@entry=0x2c7ab80, resp=..., req=..., str=..., time_serving=@0x7f9f61e776a4: 14936, time_serving@entry=@0x7f9f61e776a4: 0,
padding_max_primitive=@0x7f9f61e776a8: 62, padding_max_primitive@entry=@0x7f9f61e776a8: 0, padding_max_process=@0x7f9f61e776ac: 62)
at /home/xiaoju/user/panjize/auto-build/data/modules/route-feature-dl/src/service/rh_handler.cpp:174

Anything else we should know about your project / environment?

gRPC with TensorFlow Serving 1.4 works fine. After upgrading to TensorFlow Serving 1.5, the same client code makes the process crash (core dump) when we free the gRPC client.
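
The report does not show how TFSFreeDataClient releases the channel, so the cause of the crash is unknown. One thing that can be ruled out cheaply on the client side is releasing the same client twice. The following is only a sketch of that check under stated assumptions: FakeChannel is a stand-in for grpc::Channel, and DataClient, NewDataClient and FreeDataClient are hypothetical names, not the real TFSDataClient code.

```cpp
#include <memory>

// Stand-in for grpc::Channel (assumption: not the real class).
// It counts destructions so a double release becomes visible.
struct FakeChannel {
    static int destroyed;
    ~FakeChannel() { ++destroyed; }
};
int FakeChannel::destroyed = 0;

// Hypothetical client wrapper; the real TFSDataClient layout is not
// shown in the report. The channel is held through a shared_ptr, so
// its destructor runs when the last owner goes away.
struct DataClient {
    std::shared_ptr<FakeChannel> channel;
};

// Creates a client that owns one channel.
DataClient* NewDataClient() {
    DataClient* client = new DataClient;
    client->channel = std::make_shared<FakeChannel>();
    return client;
}

// Frees the client and resets the caller's pointer, so that a second
// call with the same variable is a harmless no-op (delete on nullptr).
void FreeDataClient(DataClient*& client) {
    delete client;
    client = nullptr;
}
```

Calling FreeDataClient twice on the same pointer variable destroys the stand-in channel only once. If the real free function takes the pointer by value, as the TFSFreeDataClient(TFSDataClient*) frame in the backtrace suggests, the caller would have to reset its own pointer after the call.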

y-zeng (Contributor) commented Feb 6, 2018

@theckwolf It'd be great if we can have more information about this issue.

yashykt (Member) commented May 15, 2018

No response has been received in more than 3 months. Closing for now.

yashykt closed this as completed May 15, 2018
lock bot locked this as resolved and limited conversation to collaborators Sep 29, 2018