[error]: #0 failed to flush the buffer, and hit limit for retries. dropping all chunks in the buffer queue. retry_times=3 records=2 error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster :end of file reached (EOFError)" #770

Closed
1 task
himanshigpta opened this issue Jun 28, 2020 · 26 comments

@himanshigpta

himanshigpta commented Jun 28, 2020

(check apply)

Problem

I'm using td-agent 1.6.3 to send logs to Elasticsearch 7.5.1. It works fine for some time and then stops sending logs. Restarting the agent makes it send logs again. I'm facing this issue on multiple servers where td-agent is installed.

Steps to replicate

Provide example config and message

<source>
 @type tail
  path /data/fta_service_request_%Y%m%d.dat
  pos_file /etc/td-agent/fta_new_logs.log.pos
  tag new_fta_logs
  #read_from_head true
  <parse>
    @type multiline_grok
    #multiline_start_regexp /\d{4}-\d{1,2}-\d{1,2}/
     <grok>
      pattern %{DATA:ud}\|%{DATA:number}\|%{INT:transaction_date}
     </grok>
     <grok>
     pattern %{GREEDYDATA:message}
     </grok>
  </parse>
</source>

<source>
 @type tail
  path /data/main/fta__service_response_%Y%m%d.dat
  pos_file /etc/td-agent/fta_new_logs_res.log.pos
  tag new_fta_logs_res
  #read_from_head true
  <parse>
    @type multiline_grok
    #multiline_start_regexp /\d{4}-\d{1,2}-\d{1,2}/
     <grok>
      pattern %{DATA:keyword}\|%{DATA:message}\|%{DATA:msi}\|%{DATA:short_code}\|%{INT:transaction_datetime}
     </grok>
     <grok>
     pattern %{GREEDYDATA:message}
     </grok>
  </parse>
</source>

<filter new_fta_logs*>
    @type record_modifier
     <record>
     hostname "#{Socket.gethostname}"
     formatted_time ${Time.at(time).iso8601(3)}
     </record>
     char_encoding utf-8
     char_encoding utf-8:euc-jp
</filter>

<match new_fta_logs*>
  @type elasticsearch
  log_es_400_reason true
  user <redacted>
  password <redacted>
  type_name "_doc"
  ssl_version TLSv1_2
  ca_file "/etc/path_to/file.crt"
  hosts <redacted>
  scheme "https"
  logstash_format true
  logstash_dateformat %V
  logstash_prefix fta_logs
  include_timestamp true
  <buffer>
    @type file
    path /etc/td-agent/buffers_fta_new
    chunk_limit_size 1M
    flush_interval 5s
    retry_forever false
    retry_max_times 3
    retry_wait 10
    retry_max_interval 300
    flush_thread_count 8
    reconnect_on_error true
    reload_on_failure true
    reload_connections false
#       request_timeout 2147483648
  </buffer>
</match>

Expected Behavior or What you need to ask

It shouldn't stop sending the logs.

Using Fluentd and ES plugin versions

td-agent --version

td-agent 1.6.3

cat /etc/redhat-release

Red Hat Enterprise Linux Server release 7.2 (Maipo)

uname -r

3.10.0-327.62.1.el7.x86_64

  • ES plugin 3.x.y/2.x.y or 1.x.y
    • paste boot log of fluentd or td-agent
      reload_on_failure true
      reload_connections false
    </buffer>
  </match>
</ROOT>
2020-06-28 11:17:00 +0530 [info]: starting fluentd-1.6.3 pid=26802 ruby="2.4.1"
2020-06-28 11:17:00 +0530 [info]: spawn command to main:  cmdline=["/opt/td-agent/embedded/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/sbin/td-agent", "--under-supervisor"]
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-concat' version '2.4.0'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-elasticsearch' version '3.5.2'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-elasticsearch' version '1.9.5'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-elasticsearch' version '1.4.0'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-grok-parser' version '2.1.4'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-kafka' version '0.5.5'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-record-modifier' version '0.6.0'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.5.5'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-s3' version '1.0.0.rc3'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-td' version '1.0.0.rc1'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-td-monitoring' version '0.2.2'
2020-06-28 11:17:00 +0530 [info]: gem 'fluent-plugin-webhdfs' version '1.1.1'
2020-06-28 11:17:00 +0530 [info]: gem 'fluentd' version '1.6.3'
2020-06-28 11:17:00 +0530 [info]: gem 'fluentd' version '0.14.16'
2020-06-28 11:17:00 +0530 [info]: adding filter pattern="new_fta_logs*" type="record_modifier"
2020-06-28 11:17:00 +0530 [info]: adding match pattern="new_fta_logs*" type="elasticsearch"
2020-06-28 11:17:00 +0530 [error]: #0 unexpected error error_class=IPAddr::InvalidAddressError error="invalid address"
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/ipaddr.rb:563:in `in6_addr'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/ipaddr.rb:500:in `initialize'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/ipaddr.rb:518:in `new'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/ipaddr.rb:518:in `coerce_other'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/ipaddr.rb:174:in `include?'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/uri/generic.rb:1541:in `block in find_proxy'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/uri/generic.rb:1530:in `scan'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/uri/generic.rb:1530:in `find_proxy'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/faraday-0.12.1/lib/faraday/connection.rb:88:in `block in initialize'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/faraday-0.12.1/lib/faraday/options.rb:75:in `fetch'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/faraday-0.12.1/lib/faraday/connection.rb:83:in `initialize'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/http/faraday.rb:38:in `new'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/http/faraday.rb:38:in `__build_connection'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:138:in `block in __build_connections'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:130:in `map'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:130:in `__build_connections'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:40:in `initialize'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-elasticsearch-3.5.2/lib/fluent/plugin/out_elasticsearch.rb:421:in `new'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-elasticsearch-3.5.2/lib/fluent/plugin/out_elasticsearch.rb:421:in `client'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-elasticsearch-3.5.2/lib/fluent/plugin/elasticsearch_index_template.rb:36:in `rescue in retry_operate'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-elasticsearch-3.5.2/lib/fluent/plugin/elasticsearch_index_template.rb:34:in `retry_operate'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-elasticsearch-3.5.2/lib/fluent/plugin/out_elasticsearch.rb:244:in `configure'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/plugin.rb:164:in `configure'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/agent.rb:130:in `add_match'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/agent.rb:72:in `block in configure'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/agent.rb:64:in `each'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/agent.rb:64:in `configure'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/root_agent.rb:150:in `configure'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/engine.rb:131:in `configure'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/engine.rb:96:in `run_configure'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/supervisor.rb:804:in `run_configure'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/supervisor.rb:550:in `block in run_worker'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/supervisor.rb:733:in `main_process'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/supervisor.rb:546:in `run_worker'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/lib/fluent/command/fluentd.rb:320:in `<top (required)>'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.6.3/bin/fluentd:8:in `<top (required)>'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/bin/fluentd:22:in `load'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/bin/fluentd:22:in `<top (required)>'
  2020-06-28 11:17:00 +0530 [error]: #0 /usr/sbin/td-agent:7:in `load'
  2020-06-28 11:17:00 +0530 [error]: #0 /usr/sbin/td-agent:7:in `<main>'
  • paste result of fluent-gem list, td-agent-gem list or your Gemfile.lock
# td-agent-gem list

*** LOCAL GEMS ***

addressable (2.5.1)
aws-sdk (2.9.19)
aws-sdk-core (2.9.19)
aws-sdk-resources (2.9.19)
aws-sigv4 (1.0.0)
bigdecimal (default: 1.3.0)
bundler (1.14.5)
bzip2-ffi (1.0.0)
cool.io (1.5.0)
did_you_mean (1.1.0)
dig_rb (1.0.1)
elasticsearch (5.0.4)
elasticsearch-api (5.0.4)
elasticsearch-transport (5.0.4)
excon (0.55.0)
faraday (0.12.1)
ffi (1.9.18)
fluent-logger (0.7.1)
fluent-plugin-concat (2.4.0)
fluent-plugin-elasticsearch (3.5.2, 1.9.5, 1.4.0)
fluent-plugin-grok-parser (2.1.4)
fluent-plugin-kafka (0.5.5)
fluent-plugin-record-modifier (0.6.0)
fluent-plugin-rewrite-tag-filter (1.5.5)
fluent-plugin-s3 (1.0.0.rc3)
fluent-plugin-td (1.0.0.rc1)
fluent-plugin-td-monitoring (0.2.2)
fluent-plugin-webhdfs (1.1.1)
fluentd (1.6.3, 0.14.16)
hirb (0.7.3)
http_parser.rb (0.6.0)
httpclient (2.8.2.4)
io-console (default: 0.4.6)
ipaddress (0.8.3)
jmespath (1.3.1)
json (default: 2.0.2)
ltsv (0.1.0)
mini_portile2 (2.1.0)
minitest (5.10.1)
mixlib-cli (1.7.0)
mixlib-config (2.2.4)
mixlib-log (1.7.1)
mixlib-shellout (2.2.7)
msgpack (1.1.0)
multi_json (1.12.1)
multipart-post (2.0.0)
net-telnet (0.1.1)
nokogiri (1.7.2)
ohai (6.20.0)
oj (2.18.5)
openssl (default: 2.0.3)
parallel (1.8.0)
power_assert (0.4.1)
psych (default: 2.2.2)
public_suffix (2.0.5)
rake (12.0.0)
rdoc (default: 5.0.0)
ruby-kafka (0.3.17)
ruby-progressbar (1.8.1)
rubyzip (1.1.7)
serverengine (2.0.5)
sigdump (0.2.4)
strptime (0.2.3, 0.1.9)
systemu (2.5.2)
td (0.15.2)
td-client (1.0.0, 0.8.85)
td-logger (0.3.27)
test-unit (3.2.3)
thread_safe (0.3.6)
tzinfo (1.2.3)
tzinfo-data (1.2017.2)
uuidtools (2.1.5)
webhdfs (0.8.0)
xmlrpc (0.2.1)
yajl-ruby (1.3.0)
zip-zip (0.3)
  • ES version (optional)
    7.5.1

Your Error Log

Entire td-agent.log file is filled with this error & nothing else:

2020-06-28 04:24:42 +0530 [error]: #0 failed to flush the buffer, and hit limit for retries. dropping all chunks in the buffer queue. retry_times=3 records=2 error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"redacted\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"redacted\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"redacted\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"redacted\", :password=>\"obfuscated\"}): end of file reached (EOFError)"
  2020-06-28 04:24:42 +0530 [error]: #0 suppressed same stacktrace
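
For context on the "hit limit for retries" message: with the buffer settings above (retry_wait 10, retry_max_times 3, retry_max_interval 300) the retry budget is short. The sketch below assumes Fluentd's default exponential backoff with a base of 2 and leaves out the random jitter Fluentd normally applies. retry_intervals is a hypothetical helper for illustration, not part of Fluentd.

```ruby
# Approximate wait before each retry under exponential backoff.
# Assumption: the wait for the nth retry is retry_wait * base**(n - 1),
# capped at retry_max_interval; Fluentd's random jitter is left out.
def retry_intervals(retry_wait:, retry_max_times:, retry_max_interval:, base: 2)
  (1..retry_max_times).map do |n|
    [retry_wait * base**(n - 1), retry_max_interval].min
  end
end

intervals = retry_intervals(retry_wait: 10, retry_max_times: 3, retry_max_interval: 300)
puts "waits between attempts: #{intervals.inspect} seconds"
puts "total retry window: about #{intervals.sum} seconds"
```

Under these assumptions, a connection problem lasting longer than roughly 70 seconds would exhaust the three retries, after which the chunks are dropped as the log message says.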

In the boot log of td-agent, I find this error: 2020-06-28 11:17:14 +0530 [error]: #0 unexpected error error_class=IPAddr::InvalidAddressError error="invalid address"

I'm giving the hosts in the config file in the following format:
hosts redacted.fqdn:9200,redacted.fqdn2:9200

I tried without the port number too and got the same error. There is also another server where the same EOFError appears in td-agent.log but the boot log shows no error. The configs are the same except for log-specific values such as pattern and path.

@himanshigpta
Author

himanshigpta commented Jun 28, 2020

@repeatedly @cosmo0920 Could you please provide some insight here...?

@cosmo0920
Collaborator

cosmo0920 commented Jul 3, 2020

First, these are not buffer parameters:

    reconnect_on_error true
    reload_on_failure true
    reload_connections false

Your configuration should be:

<match new_fta_logs*>
  @type elasticsearch
  log_es_400_reason true
  user <redacted>
  password <redacted>
  type_name "_doc"
  ssl_version TLSv1_2
  ca_file "/etc/path_to/file.crt"
  hosts <redacted>
  scheme "https"
  logstash_format true
  logstash_dateformat %V
  logstash_prefix fta_logs
  include_timestamp true
  reconnect_on_error true
  reload_on_failure true
  reload_connections false
  <buffer>
    @type file
    path /etc/td-agent/buffers_fta_new
    chunk_limit_size 1M
    flush_interval 5s
    retry_forever false
    retry_max_times 3
    retry_wait 10
    retry_max_interval 300
    flush_thread_count 8
#       request_timeout 2147483648
  </buffer>
</match>

The boot log error unexpected error error_class=IPAddr::InvalidAddressError error="invalid address" suggests that a value being handled as an IP address is not valid.

Also, the elasticsearch plugin and its client that you are using are too old:
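
The error class can be reproduced in isolation: Ruby's IPAddr raises IPAddr::InvalidAddressError for any string that is not an IP literal, such as a hostname. The sketch below only illustrates that behaviour. ip_literal? is a hypothetical helper, the sample values are made up, and which string actually reached IPAddr in the trace above cannot be told from the log.

```ruby
require "ipaddr"

# Returns true when the string parses as an IPv4 or IPv6 literal.
# IPAddr.new raises IPAddr::InvalidAddressError for anything else
# (for example a hostname), which is the error class in the boot log.
def ip_literal?(value)
  IPAddr.new(value)
  true
rescue IPAddr::InvalidAddressError
  false
end

# Illustrative values only, not taken from the reporter's config.
["192.0.2.10", "2001:db8::1", "es-node.example.com"].each do |value|
  puts "#{value}: #{ip_literal?(value) ? 'IP literal' : 'not an IP literal'}"
end
```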

  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/faraday-0.12.1/lib/faraday/connection.rb:88:in `block in initialize'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/faraday-0.12.1/lib/faraday/options.rb:75:in `fetch'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/faraday-0.12.1/lib/faraday/connection.rb:83:in `initialize'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/http/faraday.rb:38:in `new'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/http/faraday.rb:38:in `__build_connection'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:138:in `block in __build_connections'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:130:in `map'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:130:in `__build_connections'
  2020-06-28 11:17:00 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-5.0.4/lib/elasticsearch/transport/transport/base.rb:40:in `initialize'

The latest stable version of elasticsearch-ruby, the ES client for Ruby, should be used to communicate with ES 7.5:
https://github.com/elastic/elasticsearch-ruby#compatibility

@himanshigpta
Author

@cosmo0920 Thanks for your suggestion! I updated the gems yesterday on a few servers, but the error persists:

2020-07-07 12:08:55 +0530 [warn]: #0 suppressed same stacktrace
2020-07-07 12:08:55 +0530 [warn]: #0 failed to flush the buffer. retry_time=3 next_retry_seconds=2020-07-07 12:09:34 +0530 chunk="5a9d43b492e300b5fceac8c5275becfd" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}): end of file reached (EOFError)"

Boot log:

2020-07-07 12:31:48 +0530 [error]: #0 unexpected error error_class=IPAddr::InvalidAddressError error="invalid address"
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/ipaddr.rb:563:in `in6_addr'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/ipaddr.rb:500:in `initialize'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/ipaddr.rb:518:in `new'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/ipaddr.rb:518:in `coerce_other'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/ipaddr.rb:174:in `include?'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/uri/generic.rb:1541:in `block in find_proxy'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/uri/generic.rb:1530:in `scan'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/2.4.0/uri/generic.rb:1530:in `find_proxy'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/faraday-0.17.3/lib/faraday/connection.rb:454:in `proxy_from_env'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/faraday-0.17.3/lib/faraday/connection.rb:86:in `initialize'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/faraday-0.17.3/lib/faraday.rb:69:in `new'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/faraday-0.17.3/lib/faraday.rb:69:in `new'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-7.5.0/lib/elasticsearch/transport/transport/http/faraday.rb:41:in `__build_connection'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-7.5.0/lib/elasticsearch/transport/transport/base.rb:144:in `block in __build_connections'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-7.5.0/lib/elasticsearch/transport/transport/base.rb:136:in `map'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-7.5.0/lib/elasticsearch/transport/transport/base.rb:136:in `__build_connections'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/elasticsearch-transport-7.5.0/lib/elasticsearch/transport/transport/base.rb:47:in `initialize'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-elasticsearch-3.5.2/lib/fluent/plugin/out_elasticsearch.rb:421:in `new'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-elasticsearch-3.5.2/lib/fluent/plugin/out_elasticsearch.rb:421:in `client'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-elasticsearch-3.5.2/lib/fluent/plugin/elasticsearch_index_template.rb:36:in `rescue in retry_operate'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-elasticsearch-3.5.2/lib/fluent/plugin/elasticsearch_index_template.rb:34:in `retry_operate'
  2020-07-07 12:31:48 +0530 [error]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-elasticsearch-3.5.2/lib/fluent/plugin/out_elasticsearch.rb:244:in `configure'


Also, the hostnames I'm passing in the hosts setting are valid; I'm able to ping them.
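
A successful ping only shows that the host answers ICMP; it does not show that the Elasticsearch port accepts TCP connections. A minimal sketch of a port-level check follows. tcp_reachable? is a hypothetical helper, and the host and port in the example call are placeholders, since the real hostnames are redacted here. It does not test TLS or authentication.

```ruby
require "socket"

# True when a TCP connection to host:port can be opened within the timeout.
# This checks reachability of the port only, not TLS or credentials.
def tcp_reachable?(host, port, timeout: 3)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Placeholder target; replace with one of the hosts from the config.
puts "127.0.0.1:9200 reachable? #{tcp_reachable?('127.0.0.1', 9200)}"
```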

Gem list looks like this now:

# td-agent-gem list

*** LOCAL GEMS ***

addressable (2.5.1)
aws-sdk (2.9.19)
aws-sdk-core (2.9.19)
aws-sdk-resources (2.9.19)
aws-sigv4 (1.0.0)
bigdecimal (default: 1.3.0)
bundler (1.14.5)
bzip2-ffi (1.0.0)
cool.io (1.5.0)
did_you_mean (1.1.0)
dig_rb (1.0.1)
elasticsearch (7.5.0, 5.0.4)
elasticsearch-api (7.5.0, 5.0.4)
elasticsearch-transport (7.5.0, 5.0.4)
excon (0.55.0)
faraday (0.17.3, 0.12.1)
ffi (1.9.18)
fluent-logger (0.7.1)
fluent-plugin-concat (2.4.0)
fluent-plugin-elasticsearch (3.5.2, 1.9.5, 1.4.0)
fluent-plugin-grok-parser (2.1.4)
fluent-plugin-kafka (0.5.5)
fluent-plugin-record-modifier (0.6.0)
fluent-plugin-rewrite-tag-filter (1.5.5)
fluent-plugin-s3 (1.0.0.rc3)
fluent-plugin-td (1.0.0.rc1)
fluent-plugin-td-monitoring (0.2.2)
fluent-plugin-webhdfs (1.1.1)
fluentd (1.6.3, 0.14.16)
hirb (0.7.3)
http_parser.rb (0.6.0)
httpclient (2.8.2.4)
io-console (default: 0.4.6)
ipaddress (0.8.3)
jmespath (1.3.1)
json (default: 2.0.2)
ltsv (0.1.0)
mini_portile2 (2.1.0)
minitest (5.10.1)
mixlib-cli (1.7.0)
mixlib-config (2.2.4)
mixlib-log (1.7.1)
mixlib-shellout (2.2.7)
msgpack (1.1.0)
multi_json (1.12.1)
multipart-post (2.0.0)
net-telnet (0.1.1)
nokogiri (1.7.2)
ohai (6.20.0)
oj (2.18.5)
openssl (default: 2.0.3)
parallel (1.8.0)
power_assert (0.4.1)
psych (default: 2.2.2)
public_suffix (2.0.5)
rake (12.0.0)
rdoc (default: 5.0.0)
ruby-kafka (0.3.17)
ruby-progressbar (1.8.1)
rubyzip (1.1.7)
serverengine (2.0.5)
sigdump (0.2.4)
strptime (0.2.3, 0.1.9)
systemu (2.5.2)
td (0.15.2)
td-client (1.0.0, 0.8.85)
td-logger (0.3.27)
test-unit (3.2.3)
thread_safe (0.3.6)
tzinfo (1.2.3)
tzinfo-data (1.2017.2)
uuidtools (2.1.5)
webhdfs (0.8.0)
xmlrpc (0.2.1)
yajl-ruby (1.3.0)
zip-zip (0.3)

@cosmo0920
Collaborator

cosmo0920 commented Jul 7, 2020

Thanks for the feedback.
I re-investigated this issue and found that it appears to be a bug in Ruby 2.4 itself.
https://bugzilla.redhat.com/show_bug.cgi?id=1474185
ruby/ruby#1513
Could you upgrade to Ruby 2.5.0 or later?
Ruby 2.4 doesn't include this fix and is already EOL.
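
To check whether a given interpreter meets the suggested minimum, a comparison like the following could be run with the Ruby that ships inside td-agent (for the 1.6.3 install, the boot log shows /opt/td-agent/embedded/bin/ruby). This is a minimal sketch, and ruby_at_least? is a hypothetical helper name.

```ruby
require "rubygems" # provides Gem::Version; normally loaded by default

# True when `current` is at least `minimum`, compared as version numbers
# rather than as strings (so "2.10.0" sorts after "2.5.0").
def ruby_at_least?(minimum, current = RUBY_VERSION)
  Gem::Version.new(current) >= Gem::Version.new(minimum)
end

puts "running Ruby #{RUBY_VERSION}"
puts "2.5.0 or later? #{ruby_at_least?('2.5.0')}"
```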

@cosmo0920
Collaborator

Or, could you test td-agent4 from here?
https://td-agent-package-browser.herokuapp.com/4/redhat/7/x86_64
Or, could you use these scripts to upgrade td-agent?
fluent/fluent-package-builder#123 (comment)

@himanshigpta
Author

@cosmo0920 Thanks for the prompt reply! I'll try installing td-agent4 & will observe the logs for some time, will share the results.

@himanshigpta
Author

@cosmo0920 Is there any documentation for the new version of td-agent that I can refer to? I searched online but might have missed it; could you please share the link? I've installed the latest version:

# td-agent --version
td-agent 1.11.1

For simplicity, I'm using /var/log/messages with pretty much the same configuration:

<source>
  @type tail
  path /var/log/messages
  pos_file /path/to/var_log_msg_grok.log.pos
  time_format %b %dT%H:%M:%SZ
  tag var.msg
 # read_from_head true
  <parse>
    @type multiline_grok
#    multiline_start_regexp /[\w]+ \d{1,2} \d{1,2}:\d{1,2}:\d{1,2}/
    <grok>
     pattern %{SYSLOGTIMESTAMP:time}%{SPACE}%{HOSTNAME:hostname}%{SPACE}%{GREEDYDATA:service_name}:%{GREEDYDATA:log_meassage}
    </grok>
  </parse>
</source>

<filter var.msg>
    @type record_modifier
     <record>
     hostname "#{Socket.gethostname}"
     formatted_time ${Time.at(time).iso8601(3)}
     </record>
     char_encoding utf-8
     char_encoding utf-8:euc-jp
</filter>

<match var.msg>
  @type elasticsearch
  log_es_400_reason true
  user redacted
  password redacted
  type_name "_doc"
  ssl_version TLSv1_2
  ca_file "/path/to/abc.crt"
  hosts redacted
  scheme "https"
  logstash_format true
  logstash_dateformat %V
  logstash_prefix messages
  include_timestamp true
  reconnect_on_error true
  reload_on_failure true
  <buffer>
    @type file
    path /path/to/messages/buffers
    chunk_limit_size 1M
    flush_interval 5s
    retry_forever false
    retry_max_times 3
    retry_wait 10
    retry_max_interval 300
    flush_thread_count 8
  </buffer>
</match>

Boot log:

2020-07-08 19:09:33 +0530 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-concat' version '2.4.0'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-elasticsearch' version '4.0.9'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-grok-parser' version '0.0.2'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-kafka' version '0.13.0'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-prometheus' version '1.8.0'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-prometheus_pushgateway' version '0.0.2'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-record-modifier' version '2.1.0'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.3.0'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-s3' version '1.3.3'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-systemd' version '1.0.2'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-td' version '1.1.0'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-td-monitoring' version '1.0.0'
2020-07-08 19:09:33 +0530 [info]: gem 'fluent-plugin-webhdfs' version '1.2.5'
2020-07-08 19:09:33 +0530 [info]: gem 'fluentd' version '1.11.1'
2020-07-08 19:09:33 +0530 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Unknown parser plugin 'multiline_grok'. Run 'gem search -rd fluent-plugin' to find plugins"

I'm getting the above error even after installing the grok-parser plugin. The gem list follows:

# td-agent-gem list

*** LOCAL GEMS ***

addressable (2.7.0)
async (1.26.2)
async-http (0.52.4)
async-io (1.30.0)
async-pool (0.3.2)
aws-eventstream (1.1.0)
aws-partitions (1.337.0)
aws-sdk-core (3.102.1)
aws-sdk-kms (1.35.0)
aws-sdk-s3 (1.72.0)
aws-sdk-sqs (1.29.0)
aws-sigv4 (1.2.1)
benchmark (default: 0.1.0)
bigdecimal (default: 2.0.0)
bundler (2.1.4)
cgi (default: 0.1.0)
chef-config (16.2.73)
chef-utils (16.2.73)
concurrent-ruby (1.1.6)
console (1.8.2)
cool.io (1.6.0)
csv (default: 3.1.2)
date (default: 3.0.0)
delegate (default: 0.1.0)
did_you_mean (default: 1.4.0)
digest-crc (0.6.1)
elasticsearch (7.8.0)
elasticsearch-api (7.8.0)
elasticsearch-transport (7.8.0)
etc (default: 1.1.0)
excon (0.75.0)
faraday (1.0.1)
fcntl (default: 1.0.0)
ffi (1.13.1)
ffi-yajl (2.3.3)
fiddle (default: 1.0.0)
fileutils (default: 1.4.1)
fluent-config-regexp-type (1.0.0)
fluent-logger (0.8.2)
fluent-plugin-concat (2.4.0)
fluent-plugin-elasticsearch (4.0.9)
fluent-plugin-grok-parser (0.0.2)
fluent-plugin-kafka (0.13.0)
fluent-plugin-prometheus (1.8.0)
fluent-plugin-prometheus_pushgateway (0.0.2)
fluent-plugin-record-modifier (2.1.0)
fluent-plugin-rewrite-tag-filter (2.3.0)
fluent-plugin-s3 (1.3.3)
fluent-plugin-systemd (1.0.2)
fluent-plugin-td (1.1.0)
fluent-plugin-td-monitoring (1.0.0)
fluent-plugin-webhdfs (1.2.5)
fluentd (1.11.1)
forwardable (default: 1.3.1)
fuzzyurl (0.9.0)
getoptlong (default: 0.1.0)
hirb (0.7.3)
http_parser.rb (0.6.0)
httpclient (2.8.3, 2.8.2.4)
io-console (default: 0.5.6)
ipaddr (default: 1.2.2)
ipaddress (0.8.3)
irb (default: 1.2.3)
jmespath (1.4.0)
json (default: 2.3.0)
libyajl2 (1.2.0)
logger (default: 1.4.2)
ltsv (0.1.2)
matrix (default: 0.2.0)
mini_portile2 (2.5.0)
minitest (5.13.0)
mixlib-cli (2.1.6, 1.7.0)
mixlib-config (3.0.6, 2.2.3)
mixlib-log (3.0.8, 1.7.1)
mixlib-shellout (3.0.9, 2.2.7)
msgpack (1.3.3)
multi_json (1.14.1)
multipart-post (2.1.1)
mutex_m (default: 0.1.0)
net-pop (default: 0.1.0)
net-smtp (default: 0.1.0)
net-telnet (0.2.0)
nio4r (2.5.2)
nokogiri (1.11.0.rc2)
observer (default: 0.1.0)
ohai (16.2.3, 16.2.0, 6.20.0)
oj (3.10.6)
open3 (default: 0.1.0)
openssl (default: 2.1.2)
ostruct (default: 0.2.0)
parallel (1.19.2)
plist (3.5.0)
power_assert (1.1.7)
prime (default: 0.1.1)
prometheus-client (0.9.0)
protocol-hpack (1.4.2)
protocol-http (0.20.0)
protocol-http1 (0.13.0)
protocol-http2 (0.14.0)
pstore (default: 0.1.0)
psych (default: 3.1.0)
public_suffix (4.0.5)
quantile (0.2.1)
racc (default: 1.4.16)
rake (13.0.1)
rdkafka (0.8.0)
rdoc (default: 6.2.1)
readline (default: 0.0.2)
readline-ext (default: 0.1.0)
reline (default: 0.1.3)
rexml (default: 3.2.3)
rss (default: 0.2.8)
ruby-kafka (1.1.0)
ruby-progressbar (1.10.1)
rubyzip (1.3.0)
sdbm (default: 1.0.0)
serverengine (2.2.1)
sigdump (0.2.4)
singleton (default: 0.1.0)
stringio (default: 0.1.0)
strptime (0.2.4)
strscan (default: 1.0.3)
systemd-journal (1.3.3)
systemu (2.6.5, 2.5.2)
td (0.16.9)
td-client (1.0.7)
td-logger (0.3.27)
test-unit (3.3.4)
timeout (default: 0.1.0)
timers (4.3.0)
tomlrb (1.3.0)
tracer (default: 0.1.0)
tzinfo (2.0.2)
tzinfo-data (1.2020.1)
uri (default: 0.10.0)
webhdfs (0.9.0)
webrick (default: 1.6.0)
wmi-lite (1.0.5)
xmlrpc (0.3.0)
yajl-ruby (1.4.1)
yaml (default: 0.1.0)
zip-zip (0.3)
zlib (default: 1.1.0)
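
One difference between this gem list and the one from the td-agent 1.6.3 host is the grok parser: the earlier list shows fluent-plugin-grok-parser 2.1.4, while this one shows 0.0.2. Whether that explains the missing multiline_grok parser is an assumption, not something the logs confirm. A small sketch for spotting such differences between two td-agent-gem list outputs follows. parse_gem_list and changed_gems are hypothetical helpers, and the sample input is a few lines copied from the two lists.

```ruby
require "rubygems" # provides Gem::Version; normally loaded by default

# Parse `gem list` output lines such as "fluentd (1.6.3, 0.14.16)" or
# "json (default: 2.0.2)" into { "name" => ["1.6.3", "0.14.16"] }.
def parse_gem_list(text)
  text.each_line.with_object({}) do |line, gems|
    match = line.strip.match(/\A(\S+) \((.+)\)\z/)
    next unless match
    gems[match[1]] = match[2].split(", ").map { |v| v.sub(/\Adefault: /, "") }
  end
end

# Gems present in both lists whose newest installed version differs.
def changed_gems(old_text, new_text)
  old_gems = parse_gem_list(old_text)
  new_gems = parse_gem_list(new_text)
  (old_gems.keys & new_gems.keys).each_with_object({}) do |name, diff|
    old_max = old_gems[name].map { |v| Gem::Version.new(v) }.max
    new_max = new_gems[name].map { |v| Gem::Version.new(v) }.max
    diff[name] = [old_max.to_s, new_max.to_s] unless old_max == new_max
  end
end

old_list = <<~LIST
  fluent-plugin-elasticsearch (3.5.2, 1.9.5, 1.4.0)
  fluent-plugin-grok-parser (2.1.4)
  fluent-plugin-concat (2.4.0)
LIST

new_list = <<~LIST
  fluent-plugin-elasticsearch (4.0.9)
  fluent-plugin-grok-parser (0.0.2)
  fluent-plugin-concat (2.4.0)
LIST

changed_gems(old_list, new_list).each do |name, (before, after)|
  puts "#{name}: #{before} -> #{after}"
end
```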

@himanshigpta
Author

@cosmo0920 Could you please suggest whether this requires any other plugin?

Error:

2020-07-08 19:17:31 +0530 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Unknown parser plugin 'multiline_grok'. Run 'gem search -rd fluent-plugin' to find plugins"

td-agent]# td-agent-gem search -rd --local fluent-plugin

*** LOCAL GEMS ***

fluent-plugin-concat (2.4.0)
    Author: Kenji Okimoto
    Homepage:
    https://github.com/fluent-plugins-nursery/fluent-plugin-concat
    License: MIT
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    Fluentd Filter plugin to concat multiple event messages

fluent-plugin-elasticsearch (4.0.9)
    Authors: diogo, pitr, Hiroshi Hatake
    Homepage: https://github.com/uken/fluent-plugin-elasticsearch
    License: Apache-2.0
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    Elasticsearch output plugin for Fluent event collector

fluent-plugin-grok-parser (0.0.2)
    Author: kiyoto
    Homepage: https://github.com/kiyoto/fluent-plugin-grok-parser
    License: Apache License, Version 2.0
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    Fluentd plugin to suppor Logstash-inspired Grok format for parsing
    logs

fluent-plugin-kafka (0.13.0)
    Authors: Hidemasa Togashi, Masahiro Nakagawa
    Homepage: https://github.com/fluent/fluent-plugin-kafka
    License: Apache-2.0
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    Fluentd plugin for Apache Kafka > 0.8

fluent-plugin-prometheus (1.8.0)
    Author: Masahiro Sano
    Homepage: https://github.com/fluent/fluent-plugin-prometheus
    License: Apache-2.0
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    A fluent plugin that collects metrics and exposes for Prometheus.

fluent-plugin-prometheus_pushgateway (0.0.2)
    Author: Yuta Iwama
    Homepage:
    https://github.com/fluent/fluent-plugin-prometheus_pushgateway
    License: Apache-2.0
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    A fluent plugin for prometheus pushgateway

fluent-plugin-record-modifier (2.1.0)
    Author: Masahiro Nakagawa
    Homepage:
    https://github.com/repeatedly/fluent-plugin-record-modifier
    License: MIT
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    Filter plugin for modifying event record

fluent-plugin-rewrite-tag-filter (2.3.0)
    Author: Kentaro Yoshida
    Homepage: https://github.com/fluent/fluent-plugin-rewrite-tag-filter
    License: Apache-2.0
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    Fluentd Output filter plugin. It has designed to rewrite tag like
    mod_rewrite. Re-emmit a record with rewrited tag when a value
    matches/unmatches with the regular expression. Also you can change a
    tag from apache log by domain, status-code(ex. 500 error),
    user-agent, request-uri, regex-backreference and so on with regular
    expression.

fluent-plugin-s3 (1.3.3)
    Authors: Sadayuki Furuhashi, Masahiro Nakagawa
    Homepage: https://github.com/fluent/fluent-plugin-s3
    License: Apache-2.0
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    Amazon S3 output plugin for Fluentd event collector

fluent-plugin-systemd (1.0.2)
    Author: Ed Robinson
    Homepage: https://github.com/reevoo/fluent-plugin-systemd
    License: Apache-2.0
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    Input plugin to read from systemd journal.

fluent-plugin-td (1.1.0)
    Author: Treasure Data, Inc.
    Homepage: http://www.treasuredata.com/
    License: Apache-2.0
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    Treasure Data Cloud Data Service plugin for Fluentd

fluent-plugin-td-monitoring (1.0.0)
    Author: Masahiro Nakagawa
    Homepage: http://www.treasuredata.com/
    License: MIT
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

fluent-plugin-webhdfs (1.2.5)
    Author: TAGOMORI Satoshi
    Homepage: https://github.com/fluent/fluent-plugin-webhdfs
    License: Apache-2.0
    Installed at: /opt/td-agent/lib/ruby/gems/2.7.0

    Fluentd plugin to write data on HDFS over WebHDFS, with flexible
    formatting

@repeatedly
Contributor

fluent-plugin-grok-parser (0.0.2)

The problem seems to be that you installed an old version of the grok parser plugin.
The latest version is v2.6.1, but you have v0.0.2 installed: https://rubygems.org/gems/fluent-plugin-grok-parser
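
As a quick sanity check, the two version strings can be compared with RubyGems' own version class. This is only a minimal sketch: the helper name `outdated?` is hypothetical, and the version numbers are the ones quoted in this comment.

```ruby
require "rubygems"

# Hypothetical helper: true when the installed version is older than the wanted one.
def outdated?(installed, wanted)
  Gem::Version.new(installed) < Gem::Version.new(wanted)
end

# Versions taken from this thread: installed v0.0.2, latest v2.6.1.
puts outdated?("0.0.2", "2.6.1") # => true
```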

@himanshigpta
Author

himanshigpta commented Jul 9, 2020

@repeatedly @cosmo0920 Thanks! Installing v2.6.1 solved the issue, td-agent is running now, but I'm getting another error now :

2020-07-09 16:34:49 +0530 [error]: #0 incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/parser_regexp.rb:50:in `match'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/parser_regexp.rb:50:in `parse'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-grok-parser-2.6.1/lib/fluent/plugin/parser_multiline_grok.rb:21:in `block in parse'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-grok-parser-2.6.1/lib/fluent/plugin/parser_multiline_grok.rb:20:in `each'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluent-plugin-grok-parser-2.6.1/lib/fluent/plugin/parser_multiline_grok.rb:20:in `parse'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:546:in `block in parse_multilines'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:544:in `each'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:544:in `parse_multilines'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:469:in `call'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:469:in `receive_lines'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:845:in `block in handle_notify'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:877:in `with_io'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:825:in `handle_notify'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:808:in `block in on_notify'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:808:in `synchronize'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:808:in `on_notify'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:632:in `detach'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:423:in `detach_watcher'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:383:in `block in stop_watchers'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:378:in `each'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:378:in `stop_watchers'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/plugin/in_tail.rb:241:in `shutdown'
  2020-07-09 16:34:49 +0530 [error]: #0 /opt/td-agent/lib/ruby/gems/2.7.0/gems/fluentd-1.11.1/lib/fluent/root_agent.rb:277:in `block (3 levels) in shutdown'
2020-07-09 16:34:49 +0530 [info]: #0 shutting down output plugin type=:elasticsearch plugin_id="object:71c"
2020-07-09 16:34:49 +0530 [info]: #0 shutting down filter plugin type=:record_modifier plugin_id="object:898"
2020-07-09 16:34:49 +0530 [info]: Worker 0 finished with status 0
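
The first line of this trace is Ruby's generic error when a regexp with a fixed UTF-8 encoding is matched against a binary (ASCII-8BIT) string that contains non-ASCII bytes. The sketch below reproduces that situation outside Fluentd and shows that tagging the same bytes as UTF-8 lets the match proceed. The sample string and pattern are made up for illustration, and whether re-tagging is the right fix for the tailed files here is an assumption that the thread does not confirm.

```ruby
# A pattern with the /u flag has a fixed UTF-8 encoding.
PATTERN = /caf\p{L}/u

# Bytes as they might arrive from a file read in binary mode (made-up sample).
raw = "caf\xC3\xA9|12345".b

# Returns :matched, :no_match, or :incompatible instead of raising.
def try_match(pattern, text)
  pattern.match(text) ? :matched : :no_match
rescue Encoding::CompatibilityError
  :incompatible
end

puts try_match(PATTERN, raw)                                     # binary string
puts try_match(PATTERN, raw.dup.force_encoding(Encoding::UTF_8)) # same bytes, tagged UTF-8
```

The in_tail documentation describes `encoding` and `from_encoding` parameters for declaring the encoding of the lines it reads; whether setting them resolves this particular case was not verified here.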

@himanshigpta
Author

@cosmo0920 @repeatedly I'm facing the same issue described in #763. I'm already using the latest version of fluent-plugin-elasticsearch:

td-agent]# td-agent-gem list | grep fluent-plugin
fluent-plugin-elasticsearch (4.0.9)

I tried the parameters you suggested here: #763 (comment)

Here's what I'm getting in the logs now:

2020-07-10 10:59:56 +0530 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="valid options are SSLv23,TLSv1,TLSv1_1,TLSv1_2 but got TLSv1_3"

Here's what I added in the config:

ssl_max_version TLSv1_3
ssl_min_version TLSv1_2

Original error before ssl_min/max parameter update:

2020-07-10 10:24:27 +0530 [warn]: #0 retry succeeded. chunk_id="5aa0f21c350e319ac0b0e0e86cc19a2b"
2020-07-10 10:26:34 +0530 [warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2020-07-10 10:26:44.627523561 +0530 chunk="5aa0f29f298a3417781af771a568020c" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}): hostname \"redacted\" does not match the server certificate (OpenSSL::SSL::SSLError)"
  2020-07-10 10:26:34 +0530 [warn]: #0 suppressed same stacktrace

@cosmo0920
Collaborator

cosmo0920 commented Jul 10, 2020

Here's what I'm getting in the logs now:

2020-07-10 10:59:56 +0530 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="valid options are SSLv23,TLSv1,TLSv1_1,TLSv1_2 but got TLSv1_3"

Here's what I added in the config:

ssl_max_version TLSv1_3
ssl_min_version TLSv1_2

It seems that your environment does not support TLSv1_3.

ssl_max_version TLSv1_3
ssl_min_version TLSv1_2

should be

ssl_version TLSv1_2

ref: https://github.com/uken/fluent-plugin-elasticsearch#clienthost-certificate-options
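
One way to see what the Ruby runtime's OpenSSL binding offers is to ask it directly. This is a small sketch, assuming it is run with the same Ruby that td-agent uses. The helper name `tls13_constant?` is hypothetical, and treating the presence of `OpenSSL::SSL::TLS1_3_VERSION` as the signal for TLS 1.3 support is an assumption, not a statement about how the plugin decides.

```ruby
require "openssl"

# Hypothetical helper: reports whether this Ruby's OpenSSL binding
# exposes the TLS 1.3 protocol-version constant.
def tls13_constant?
  OpenSSL::SSL.const_defined?(:TLS1_3_VERSION)
end

# Version string of the OpenSSL library loaded at runtime.
puts "linked OpenSSL: #{OpenSSL::OPENSSL_LIBRARY_VERSION}"
puts "TLS 1.3 constant defined: #{tls13_constant?}"
```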

@himanshigpta
Author

@cosmo0920 ssl_version TLSv1_2 is the parameter I've been using. It's been a few days now and I haven't seen any EOF errors on any of the servers where I installed td-agent4! :) The logs are getting shipped as expected, but I still see the OpenSSL error in the logs of all those servers:

2020-07-13 14:46:19 +0530 [info]: #0 detected rotation of /var/log/hadoop-hdfs/hadoop.log.out; waiting 5 seconds
2020-07-13 14:46:19 +0530 [info]: #0 following tail of /var/log/hadoop-hdfs/hadoop.log.out
2020-07-14 02:03:28 +0530 [warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2020-07-14 02:03:38.560797464 +0530 chunk="5aa589a24ce02c188f291540bfeb64af" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}, {:host=>\"redacted\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}): hostname \"redacted<ES Master>\" does not match the server certificate (OpenSSL::SSL::SSLError)"
  2020-07-14 02:03:28 +0530 [warn]: #0 suppressed same stacktrace
2020-07-14 02:03:38 +0530 [warn]: #0 retry succeeded. chunk_id="5aa589a24ce02c188f291540bfeb64af"

Didn't face this issue with the previous version of td-agent.

@cosmo0920
Collaborator

cosmo0920 commented Jul 14, 2020

Thanks for trying to use td-agent4.

BTW, chunk="5aa589a24ce02c188f291540bfeb64af" appears to have been handled correctly after the SSL error.
See the "2020-07-14 02:03:38 +0530 [warn]: #0 retry succeeded. chunk_id="5aa589a24ce02c188f291540bfeb64af"" line:

retry_time=0 next_retry_seconds=2020-07-14 02:03:38.560797464 +0530 chunk="5aa589a24ce02c188f291540bfeb64af" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>"redacted", :port=>9200, :scheme=>"https", :user=>"elastic", :password=>"obfuscated"}, {:host=>"redacted", :port=>9200, :scheme=>"https", :user=>"elastic", :password=>"obfuscated"}, {:host=>"redacted", :port=>9200, :scheme=>"https", :user=>"elastic", :password=>"obfuscated"}, {:host=>"redacted", :port=>9200, :scheme=>"https", :user=>"elastic", :password=>"obfuscated"}, {:host=>"redacted", :port=>9200, :scheme=>"https", :user=>"elastic", :password=>"obfuscated"}): hostname "redacted" does not match the server certificate (OpenSSL::SSL::SSLError)"
2020-07-14 02:03:28 +0530 [warn]: #0 suppressed same stacktrace
2020-07-14 02:03:38 +0530 [warn]: #0 retry succeeded. chunk_id="5aa589a24ce02c188f291540bfeb64af"
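
To check this across a whole log rather than by eye, failed chunks can be paired with their later "retry succeeded" lines. The following is a rough sketch: the helper name `unrecovered_chunks` is hypothetical, and it assumes the `chunk="..."` and `chunk_id="..."` formats seen in the lines quoted above.

```ruby
# Returns the chunk ids that failed to flush and never logged "retry succeeded".
def unrecovered_chunks(lines)
  pending = {}
  lines.each do |line|
    if line.include?("failed to flush the buffer") && (m = line.match(/chunk="(\h+)"/))
      pending[m[1]] = true
    elsif line.include?("retry succeeded") && (m = line.match(/chunk_id="(\h+)"/))
      pending.delete(m[1])
    end
  end
  pending.keys
end

# Two lines abbreviated from the log excerpt in this thread.
log = [
  '2020-07-14 02:03:28 +0530 [warn]: #0 failed to flush the buffer. retry_time=0 chunk="5aa589a24ce02c188f291540bfeb64af"',
  '2020-07-14 02:03:38 +0530 [warn]: #0 retry succeeded. chunk_id="5aa589a24ce02c188f291540bfeb64af"',
]
p unrecovered_chunks(log) # => []
```

An empty result means every chunk that hit an error was eventually flushed, which matches the reading of the log above.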

@himanshigpta
Author

@cosmo0920 Yes, the logs are getting shipped, & though the OpenSSL error occurrence is not very frequent yet, is there a way to get rid of it? Will it pose any issue in future that you might know of..?

With TLSv1_2, I'm getting the error reported earlier (hostname "redacted" does not match the server certificate (OpenSSL::SSL::SSLError)),
with TLSv1_3: error_class=Fluent::ConfigError error="valid options are SSLv23,TLSv1,TLSv1_1,TLSv1_2 but got TLSv1_3",
with SSLv23:
[warn]: #0 Detected ES 6.x or above and enabled insecure security: You might have to specify ssl_version TLSv1_2 in configuration.

Thanks again for your help! :)

@cosmo0920
Collaborator

@cosmo0920 Yes, the logs are getting shipped, & though the OpenSSL error occurrence is not very frequent yet, is there a way to get rid of it? Will it pose any issue in future that you might know of..?

I have no idea how to get rid of the OpenSSL error.

@himanshigpta
Author

@cosmo0920 Now the entire log file is filled with the OpenSSL error, & td-agent has stopped sending logs to ES. After a restart it starts working fine again. One thing to note is that, while installing td-agent4, it prompted me to update the openssl packages:

Installing:
 td-agent                                x86_64                           4.0.0-1.el7                               /td-agent-4.0.0-1.el7.x86_64                            56 M
Installing for dependencies:
 libyaml                                 x86_64                           0.1.4-11.el7_0                            rhel                                                    55 k
Updating for dependencies:
 openssl                                 x86_64                           1:1.0.2k-19.el7                           rhel                                                   493 k
 openssl-devel                           x86_64                           1:1.0.2k-19.el7                           rhel                                                   1.5 M
 openssl-libs                            x86_64                           1:1.0.2k-19.el7                           rhel                                                   1.2 M

Transaction Summary
=================================================================================================================================================================================
Install  1 Package  (+1 Dependent package)
Upgrade             ( 3 Dependent packages)

The previous version of td-agent that I was using worked fine with the openssl version that was installed.

@cosmo0920
Collaborator

Yeah, td-agent4 now uses the system OpenSSL libraries and depends on them.

@himanshigpta
Author

@cosmo0920 We're using a wildcard certificate on all the ES nodes, including the node where td-agent is running. As per the OpenSSL error, it is trying to match the IP against the server certificate (which has a pattern rather than the exact IP/hostname, since it is a wildcard). Is there a way to disable hostname matching on the td-agent side?
I could only find this : https://github.com/uken/fluent-plugin-elasticsearch#user-password-path-scheme-ssl_verify
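
The mismatch described here can be reproduced with Ruby's OpenSSL binding alone. This is a sketch under assumptions: the `*.example.com` pattern, the host name, and the IP address are placeholders, the helper `wildcard_cert` is hypothetical, and the throwaway self-signed certificate only stands in for the real wildcard certificate. It uses `OpenSSL::SSL.verify_certificate_identity` from the Ruby standard library, which may differ in detail from the check performed in the actual HTTP stack.

```ruby
require "openssl"

# Hypothetical helper: build a throwaway self-signed certificate whose
# CN and subjectAltName carry the given wildcard pattern.
def wildcard_cert(pattern)
  key = OpenSSL::PKey::RSA.new(2048)
  cert = OpenSSL::X509::Certificate.new
  cert.version = 2
  cert.serial = 1
  cert.subject = OpenSSL::X509::Name.parse("/CN=#{pattern}")
  cert.issuer = cert.subject
  cert.public_key = key.public_key
  cert.not_before = Time.now - 60
  cert.not_after = Time.now + 3600
  ef = OpenSSL::X509::ExtensionFactory.new
  ef.subject_certificate = cert
  ef.issuer_certificate = cert
  cert.add_extension(ef.create_extension("subjectAltName", "DNS:#{pattern}"))
  cert.sign(key, OpenSSL::Digest.new("SHA256"))
  cert
end

cert = wildcard_cert("*.example.com")

# A host name under the wildcard matches; a bare IP address does not.
puts OpenSSL::SSL.verify_certificate_identity(cert, "es1.example.com") # => true
puts OpenSSL::SSL.verify_certificate_identity(cert, "10.0.0.5")        # => false
```

If that is what happens here, pointing the output at host names covered by the wildcard, rather than at IP addresses, might avoid the error without disabling any verification.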

@cosmo0920
Collaborator

cosmo0920 commented Jul 17, 2020

@cosmo0920 We're using a wildcard certificate on all the ES nodes, including the node where td-agent is running. As per the OpenSSL error, it is trying to match the IP against the server certificate (which has a pattern rather than the exact IP/hostname, since it is a wildcard). Is there a way to disable hostname matching on the td-agent side?
I could only find this : https://github.com/uken/fluent-plugin-elasticsearch#user-password-path-scheme-ssl_verify

I've investigated this issue, but Faraday does not support a verify_hostname option in its SSL options:
https://github.com/lostisland/faraday/blob/d77c9efee9b12763f685f46df470eb22351154f5/lib/faraday/adapter/httpclient.rb#L94-L104
https://github.com/lostisland/faraday/blob/master/lib/faraday/options/ssl_options.rb#L44-L47

https://github.com/elastic/elasticsearch-ruby/blob/ee83caf8ea090775b19190341eeee2cd627fdcd0/elasticsearch-transport/lib/elasticsearch/transport/client.rb#L136-L165
https://github.com/excon/excon/blob/79566350fd7b226866dcf4c42aa84a059b9f241b/lib/excon/ssl_socket.rb

Method call graph:

Elasticsearch Ruby client --- delegates with transport_options: ssl_options ---> Faraday::Connection ---> delegates to the HTTP excon adapter ---> excon

@cosmo0920
Collaborator

cosmo0920 commented Jul 17, 2020

I've noticed that excon, one of the dependent gems, also does not support OpenSSL::SSL::SSLContext#verify_hostname=:
excon/excon#722

@himanshigpta
Author

@cosmo0920 So this means the OpenSSL issue is not related to td-agent or its dependencies; I'll see if I can figure out the reason. Thanks again for your help, time & patience in explaining! :)

@cosmo0920
Collaborator

Thanks for your patience. We hadn't noticed this issue, which originates from no longer bundling the OpenSSL libraries and using the system OpenSSL libraries instead.

@cosmo0920
Collaborator

cosmo0920 commented Jul 27, 2020

Faraday should handle verify_hostname= in its SSL options (for example, ssl: {verify_hostname: true/false}). See: lostisland/faraday#1172
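
For reference, this is the underlying switch on Ruby's own `OpenSSL::SSL::SSLContext`. The sketch uses plain standard-library OpenSSL, not Faraday's or excon's API, and the helper name `client_context` is hypothetical. How those gems expose the option is what the linked issues track.

```ruby
require "openssl"

# Hypothetical helper: build a client context that still verifies the
# certificate chain but lets the caller decide whether the hostname check runs.
def client_context(verify_hostname:)
  ctx = OpenSSL::SSL::SSLContext.new
  ctx.verify_mode = OpenSSL::SSL::VERIFY_PEER
  ctx.verify_hostname = verify_hostname
  ctx
end

ctx = client_context(verify_hostname: false)
puts ctx.verify_hostname
puts ctx.verify_mode == OpenSSL::SSL::VERIFY_PEER
```

Keeping `VERIFY_PEER` while turning off only the hostname check is narrower than disabling verification altogether, which is presumably why a dedicated option is being requested upstream.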

@cosmo0920
Collaborator

cosmo0920 commented Jul 27, 2020

elasticsearch-transport should pass-through ssl options via transport_options:

We should send patches to faraday and excon to implement verify_hostname handling.
The patch for excon has been sent. The one for Faraday is still in progress.

@cosmo0920
Collaborator

cosmo0920 commented Aug 6, 2020

This issue has been fixed. Hostname verification handling should be tracked in the dependent gems' issue trackers.
This issue should be closed.
