Td-agent issue on arm architecture based Centos8 machines #3689
aditya2301 started this conversation in General
We are facing issues with the td-agent service on an ARM-based (aarch64) CentOS 8 machine. Restarting the service fails with a core dump:
[centos@ip-10-0-3-33 ~]$ sudo systemctl restart td-agent
Job for td-agent.service failed because a fatal signal was delivered causing the control process to dump core.
See "systemctl status td-agent.service" and "journalctl -xe" for details.
[centos@ip-10-0-3-33 ~]$ sudo systemctl status td-agent
● td-agent.service - td-agent: Fluentd based data collector for Treasure Data
Loaded: loaded (/usr/lib/systemd/system/td-agent.service; enabled; vendor preset: disabled)
Active: deactivating (stop-sigterm) (Result: core-dump) since Tue 2022-03-22 12:53:45 UTC; 6min ago
Docs: https://docs.treasuredata.com/display/public/PD/About+Treasure+Data%27s+Server-Side+Agent
Process: 2664 ExecStop=/bin/kill -TERM ${MAINPID} (code=exited, status=0/SUCCESS)
Process: 2844 ExecStart=/opt/td-agent/bin/fluentd --log $TD_AGENT_LOG_FILE --daemon /var/run/td-agent/td-agent.pid $TD_AGENT_OPTIONS (code=dumped, signal=SEGV)
Main PID: 1795 (code=killed, signal=KILL)
Tasks: 15 (limit: 5464)
Memory: 163.5M
CGroup: /system.slice/td-agent.service
├─2850 /opt/td-agent/bin/ruby /opt/td-agent/bin/fluentd --log /var/log/td-agent/td-agent.log --daemon /var/run/td-agent/td-agent.pid
└─2858 /opt/td-agent/bin/ruby -Eascii-8bit:ascii-8bit /opt/td-agent/bin/fluentd --log /var/log/td-agent/td-agent.log --daemon /var/run/td-agent/td-agent.pid --under-supervisor
Mar 22 12:57:55 ip-10-0-3-33.ap-southeast-1.compute.internal fluentd[2844]: 589 /opt/td-agent/lib/ruby/gems/2.7.0/gems/async-io-1.32.2/lib/async/io/ssl_endpoint.rb
Mar 22 12:57:55 ip-10-0-3-33.ap-southeast-1.compute.internal fluentd[2844]: 590 /opt/td-agent/lib/ruby/gems/2.7.0/gems/async-http-0.56.5/lib/async/http/endpoint.rb
Mar 22 12:57:55 ip-10-0-3-33.ap-southeast-1.compute.internal fluentd[2844]: 591 /opt/td-agent/lib/ruby/gems/2.7.0/gems/async-http-0.56.5/lib/async/http.rb
Mar 22 12:57:55 ip-10-0-3-33.ap-southeast-1.compute.internal fluentd[2844]: 592 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin_helper/http_server/methods.rb
Mar 22 12:57:55 ip-10-0-3-33.ap-southeast-1.compute.internal fluentd[2844]: 593 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin_helper/http_server/request.rb
Mar 22 12:57:55 ip-10-0-3-33.ap-southeast-1.compute.internal fluentd[2844]: 594 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin_helper/http_server/app.rb
Mar 22 12:57:55 ip-10-0-3-33.ap-southeast-1.compute.internal fluentd[2844]: 595 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin_helper/http_server/router.rb
Mar 22 12:57:55 ip-10-0-3-33.ap-southeast-1.compute.internal fluentd[2844]: 596 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/plugin_helper/http_server/server.rb
Mar 22 12:57:55 ip-10-0-3-33.ap-southeast-1.compute.internal fluentd[2844]: 597 /opt/td-agent/lib/ruby/gems/2.7.0/gems/sigdump-0.2.4/lib/sigdump.rb
Mar 22 12:57:56 ip-10-0-3-33.ap-southeast-1.compute.internal systemd[1]: td-agent.service: Control process exited, code=dumped status=11 <<<<<<<<<<<<<<<<<<<<<<<
Snippet from the messages log file:
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 566 /opt/td-agent/lib/ruby/2.7.0/aarch64-linux/digest.so
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 567 /opt/td-agent/lib/ruby/2.7.0/digest.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 568 /opt/td-agent/lib/ruby/2.7.0/aarch64-linux/openssl.so
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 569 /opt/td-agent/lib/ruby/2.7.0/openssl/bn.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 570 /opt/td-agent/lib/ruby/2.7.0/openssl/pkey.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 571 /opt/td-agent/lib/ruby/2.7.0/openssl/cipher.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 572 /opt/td-agent/lib/ruby/2.7.0/openssl/config.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 573 /opt/td-agent/lib/ruby/2.7.0/openssl/digest.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 574 /opt/td-agent/lib/ruby/2.7.0/openssl/x509.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 575 /opt/td-agent/lib/ruby/2.7.0/openssl/buffering.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 576 /opt/td-agent/lib/ruby/2.7.0/aarch64-linux/io/nonblock.so
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 577 /opt/td-agent/lib/ruby/2.7.0/ipaddr.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 578 /opt/td-agent/lib/ruby/2.7.0/openssl/ssl.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 579 /opt/td-agent/lib/ruby/2.7.0/openssl/pkcs5.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 580 /opt/td-agent/lib/ruby/2.7.0/openssl.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 581 /opt/td-agent/lib/ruby/gems/2.7.0/gems/async-http-0.56.5/lib/async/http/protocol/https.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 582 /opt/td-agent/lib/ruby/gems/2.7.0/gems/async-http-0.56.5/lib/async/http/protocol.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 583 /opt/td-agent/lib/ruby/gems/2.7.0/gems/async-http-0.56.5/lib/async/http/client.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 584 /opt/td-agent/lib/ruby/gems/2.7.0/gems/protocol-http-0.22.5/lib/protocol/http/middleware.rb
Mar 22 13:00:03 ip-10-0-3-33 fluentd[2961]: 585 /opt/td-agent/lib/ruby/gems/2.7.0/gems/async-http-0.56.5/lib/async/http/server.rb
Mar 22 13:00:03 ip-10-0-3-33 systemd[1]: Started Process Core Dump (PID 2971/UID 0).
Mar 22 13:00:03 ip-10-0-3-33 systemd-coredump[2972]: Process 2961 (fluentd) of user 989 dumped core.
Stack trace of thread 2961:
#0 0x0000ffffa2a671ac classname (libruby.so.2.7)
#1 0x0000ffffa2a67f54 rb_tmp_class_path (libruby.so.2.7)
#2 0x0000ffffa2a68a90 rb_search_class_path (libruby.so.2.7)
#3 0x0000ffffa2a91770 rb_vm_bugreport (libruby.so.2.7)
#4 0x0000ffffa2922aa4 rb_bug_for_fatal_signal (libruby.so.2.7)
#5 0x0000ffffa2a18ddc sigsegv (libruby.so.2.7)
#6 0x0000ffffa2c507a0 n/a (linux-vdso.so.1)
#7 0x0000ffffa2c507a0 n/a (linux-vdso.so.1)
#8 0x0000000000000ff3 n/a (n/a)
#9 0x0000ffffa2bfc628 je_tcache_bin_flush_large (libjemalloc.so)
#10 0x00000000000003e0 n/a (n/a)
Mar 22 13:00:03 ip-10-0-3-33 systemd[1]: td-agent.service: Control process exited, code=dumped status=11
Mar 22 13:00:03 ip-10-0-3-33 systemd[1]: systemd-coredump@29-2971-0.service: Succeeded.
Mar 22 13:00:03 ip-10-0-3-33 systemd[1]: Started Process Core Dump (PID 2980/UID 0).
Mar 22 13:00:04 ip-10-0-3-33 systemd-coredump[2981]: Process 2967 (fluentd) of user 989 dumped core.
Stack trace of thread 2967:
#0 0x0000ffffa2965fc4 rb_iseq_path (libruby.so.2.7)
#1 0x0000ffffa2a7b5b4 rb_source_location_cstr (libruby.so.2.7)
#2 0x0000ffffa2922a54 rb_bug_for_fatal_signal (libruby.so.2.7)
#3 0x0000ffffa2a18ddc sigsegv (libruby.so.2.7)
#4 0x0000ffffa2c507a0 n/a (linux-vdso.so.1)
#5 0x0000ffffa2c507a0 n/a (linux-vdso.so.1)
#6 0x0000000000000ff3 n/a (n/a)
#7 0x0000ffffa2bfc56c je_tcache_bin_flush_large (libjemalloc.so)
#8 0x00000000000003e0 n/a (n/a)
Mar 22 13:00:04 ip-10-0-3-33 systemd[1]: td-agent.service: Failed with result 'core-dump'.
Mar 22 13:00:04 ip-10-0-3-33 systemd[1]: Failed to start td-agent: Fluentd based data collector for Treasure Data.
Mar 22 13:00:04 ip-10-0-3-33 systemd[1]: systemd-coredump@30-2980-0.service: Succeeded.
Mar 22 13:00:04 ip-10-0-3-33 systemd[1]: td-agent.service: Service RestartSec=100ms expired, scheduling restart.
Mar 22 13:00:04 ip-10-0-3-33 systemd[1]: td-agent.service: Scheduled restart job, restart counter is at 7.
Mar 22 13:00:04 ip-10-0-3-33 systemd[1]: Stopped td-agent: Fluentd based data collector for Treasure Data.
Mar 22 13:00:04 ip-10-0-3-33 systemd[1]: Starting td-agent: Fluentd based data collector for Treasure Data...
Mar 22 13:00:05 ip-10-0-3-33 fluentd[2988]: /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.14.3/lib/fluent/config/types.rb:87: warning: regular expression has ']' without escape
Mar 22 13:00:05 ip-10-0-3-33 systemd[1]: Started td-agent: Fluentd based data collector for Treasure Data.
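For anyone reading the raw /var/log/messages file: rsyslog writes each systemd-coredump entry on a single line and escapes embedded newlines as `#012` (a `#` followed by the octal character code). The following is a minimal Python sketch for turning such a line back into readable stack frames. The function names `unescape_syslog` and `native_frames` are made up for this illustration, and the sample string is a shortened entry built from two of the frames logged above.

```python
import re
from typing import List

# rsyslog writes a control character as '#' plus three octal digits,
# so a newline inside a message shows up as '#012'.
# Caveat: a frame label with three octal digits (e.g. '#100') would be
# misread as an escape; the traces here are far shorter than that.
_ESCAPE = re.compile(r"#([0-7]{3})")


def unescape_syslog(line: str) -> str:
    """Turn rsyslog octal escapes such as '#012' back into real characters."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), line)


def native_frames(line: str) -> List[str]:
    """Return the '#N 0x... symbol (library)' frame lines of one escaped entry."""
    decoded = unescape_syslog(line)
    return [part.strip() for part in decoded.splitlines()
            if re.match(r"\s*#\d+\s+0x", part)]


if __name__ == "__main__":
    # Shortened sample entry (illustrative, not the full log line).
    sample = ("Process 2961 (fluentd) of user 989 dumped core.#012#012"
              "Stack trace of thread 2961:#012"
              "#5  0x0000ffffa2a18ddc sigsegv (libruby.so.2.7)#012"
              "#9  0x0000ffffa2bfc628 je_tcache_bin_flush_large (libjemalloc.so)")
    for frame in native_frames(sample):
        print(frame)
```

Listing the frames one per line makes it easier to see that both crashes end in `je_tcache_bin_flush_large` from `libjemalloc.so` before Ruby's SIGSEGV handler runs.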
At present we are tailing the nginx access and error logs and a custom request_response log on these instances. These issues do not occur on x86 machines, where td-agent runs fine with a similar configuration, which makes us suspect a problem with the td-agent aarch64 packages.
The installed package is the latest version, as shown below:
[root@ip-10-0-3-33 coredump]# sudo yum list td-agent
Last metadata expiration check: 2:32:31 ago on Tue 22 Mar 2022 09:42:51 AM UTC.
Installed Packages
td-agent.aarch64 4.3.0-1.el8 @treasuredata
Kindly help in sorting out this issue.