
Crash and log spew on startup about disabling filter chain optimization #138

Closed
ramyala opened this issue May 23, 2018 · 16 comments

ramyala commented May 23, 2018

I've been seeing these log messages when using the Papertrail Docker image as defined by fluent-plugin-papertrail. I used to see a crash at some point, but I haven't seen that happen recently. I do still see these messages on startup of fluentd. Can they be ignored?

The configuration as part of the papertrail package is:

<filter kubernetes.**>
  @type kubernetes_metadata
  annotation_match ["solarwinds.io/*"]
</filter>

# append namespace and pod name to hostname, so that logs in Papertrail are filterable by each
# use container name as program name, but trim it to 32 characters to match remote_syslog spec
<filter kubernetes.**>
  @type record_transformer
  enable_ruby true
  <record>
    hostname #{ENV['FLUENT_HOSTNAME']}-${record["kubernetes"]["namespace_name"]}-${record["kubernetes"]["pod_name"]}
    program ${record["kubernetes"]["container_name"][0..31]}
    severity info
    facility local0
    message ${record['log']}
  </record>
  remove_keys ["log"]
</filter>

2018-05-23 23:46:58 +0000 [info]: starting fluentd-1.2.0 pid=1 ruby="2.3.3"
2018-05-23 23:46:58 +0000 [info]: spawn command to main:  cmdline=["/usr/bin/ruby2.3", "-Eascii-8bit:ascii-8bit", "/usr/local/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--under-supervisor"]
2018-05-23 23:47:00 +0000 [info]: gem 'fluent-mixin-config-placeholders' version '0.4.0'
2018-05-23 23:47:00 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '2.1.2'
2018-05-23 23:47:00 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_input' version '0.21.11'
2018-05-23 23:47:00 +0000 [info]: gem 'fluent-plugin-papertrail' version '0.2.4'
2018-05-23 23:47:00 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.0.2'
2018-05-23 23:47:00 +0000 [info]: gem 'fluent-plugin-systemd' version '0.3.1'
2018-05-23 23:47:00 +0000 [info]: gem 'fluentd' version '1.2.0'
2018-05-23 23:47:00 +0000 [info]: gem 'fluentd' version '0.12.43'
2018-05-23 23:47:00 +0000 [info]: adding match pattern="fluent.**" type="null"
2018-05-23 23:47:00 +0000 [info]: adding filter pattern="kubernetes.**" type="kubernetes_metadata"
2018-05-23 23:47:01 +0000 [info]: adding filter pattern="kube-apiserver-audit" type="record_transformer"
2018-05-23 23:47:01 +0000 [info]: adding filter pattern="kubernetes.**" type="record_transformer"
2018-05-23 23:47:01 +0000 [info]: adding match pattern="kubernetes.**" type="rewrite_tag_filter"
2018-05-23 23:47:01 +0000 [info]: #0 adding rewrite_tag_filter rule: $['kubernetes']['namespace_name'] [#<Fluent::PluginHelper::RecordAccessor::Accessor:0x007f72ea62d1f8 @keys=["kubernetes", "namespace_name"], @last_key="namespace_name", @dig_keys=["kubernetes"]>, /^(.+)$/, "", "$1.${tag}"]
2018-05-23 23:47:01 +0000 [info]: adding match pattern="staging.**" type="papertrail"
2018-05-23 23:47:01 +0000 [info]: #0 initializing tcp socket for logs5.papertrailapp.com:49772
2018-05-23 23:47:02 +0000 [info]: adding match pattern="production.**" type="papertrail"
2018-05-23 23:47:02 +0000 [info]: #0 initializing tcp socket for logs2.papertrailapp.com:25453
2018-05-23 23:47:02 +0000 [info]: adding match pattern="**" type="papertrail"
2018-05-23 23:47:02 +0000 [info]: #0 initializing tcp socket for logs5.papertrailapp.com:45389
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: adding source type="tail"
2018-05-23 23:47:02 +0000 [info]: #0 starting fluentd worker pid=16 ppid=1 worker=0
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/kube-proxy.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/l7-default-backend-5d5b9874d5-m65cq_kube-system_default-http-backend-6e2873e8479bc80f5dacab5fad3956541d283cb9fef13810fd4a618b869d7d19.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/l7-default-backend-5d5b9874d5-m65cq_kube-system_default-http-backend-242b33a9991611666abbe6472a0e8e6b27ec31c1dcdcbd1e6de502794d705023.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/running-quokka-harper-api-689b4b9749-7fgwn_production_api-cbf1738d9750a759a820e1ecf1bdfd7653cb0902d39388770bb702206a82f43a.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/goofys-kdzp8_finn_goofys-de61a4ce00a11030cd5c816f23052a537e18db5d17e48c5c9900dc915c258a65.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/gcsfuse-ncl9p_finn_gcsfuse-3ac054704e742f123a8d61894175c531b94e45fbeb330850927d55d6e14b2b2e.log
2018-05-23 23:47:02 +0000 [info]: #0 disable filter chain optimization because [Fluent::Plugin::KubernetesMetadataFilter, Fluent::Plugin::RecordTransformerFilter] uses `#filter_stream` method.
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/kube-proxy-gke-finn-default-pool-e7edd064-ph35_kube-system_kube-proxy-76cb23b3176e2e21bc851a485bf92d603be8ad1aaf07392d416c55f8693a0123.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/kube-dns-788979dc8f-2pwbm_kube-system_sidecar-bf6117bf8018eb29ed32abcc68a1cd0a5344668a27c33a62813f00b816302b40.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/kube-dns-788979dc8f-2pwbm_kube-system_dnsmasq-d1300fc125f4105ecbe863afa1b8a422e55f4ad9dc97313551c586fa1ae65c31.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/kube-dns-788979dc8f-2pwbm_kube-system_kubedns-0dddfacf2d7ed830ad7d5faff704645ec3cfd01dc2588b12b685a0166b987a13.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/running-quokka-harper-es-1_production_elasticsearch-0de05d301ad49f4ab7f2ee1db7ecde3c328b1d794fff24432493016b2e31ec3f.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/kube-dns-autoscaler-79b4b844b9-2br5q_kube-system_autoscaler-eff86cc65017296dba63121c25bc9749be3a23795ab6cb260d852d253b987604.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/running-quokka-harper-es-1_production_sysctl-d94d2708fb72244a202dc23f36e66b17d84a994a7fc1d762addb90ea4592217b.log
2018-05-23 23:47:02 +0000 [info]: #0 following tail of /var/log/containers/fluentd-papertrail-global-logging-chgn9_finn_fluentd-28295a328c2a9243dd5e7e4fe50b880c58ca19fa136faa27a3cd652151cdfb5d.log
2018-05-23 23:47:02 +0000 [info]: #0 disable filter chain optimization because [Fluent::Plugin::KubernetesMetadataFilter, Fluent::Plugin::RecordTransformerFilter] uses `#filter_stream` method.
2018-05-23 23:47:03 +0000 [info]: #0 following tail of /var/log/containers/kube-dns-788979dc8f-2pwbm_kube-system_prometheus-to-sd-922d2a1cc00cf74cbf4fd8c2f39f653b9d0d3e14f57f1f7ef111a47ae2652c3d.log
2018-05-23 23:47:03 +0000 [info]: #0 following tail of /var/log/containers/glusterfs-zhmlr_finn_glusterfs-12aa9adf2a037247a39e9b1ac744ff3a1d94c6dda4d8393e733470ca09c1c03c.log
2018-05-23 23:47:03 +0000 [info]: #0 fluentd worker is now running worker=0
richm (Contributor) commented May 24, 2018

Those messages look OK - I suppose you could suppress them by setting log_level warn.

Not sure what the problem is - a stacktrace or error message would be helpful.
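For example, something along these lines should quiet just that filter (a rough sketch; @log_level is the standard per-plugin parameter, and a <system> section's log_level would raise the threshold globally instead):

<filter kubernetes.**>
  @type kubernetes_metadata
  # only log warnings and above from this plugin
  @log_level warn
  annotation_match ["solarwinds.io/*"]
</filter>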

ramyala (Author) commented May 24, 2018

How could I dump a stacktrace?

richm (Contributor) commented May 31, 2018

> I used to see a crash at some point but I haven't seen that happen recently.

We need to see the crash, or an actual problem. What is the actual problem?

pappnase99 commented Dec 6, 2018

Maybe this is related to an issue I'm having right now. The fluentd log says:

2018-12-06 09:38:26 +0000 [info]: #0 disable filter chain optimization because [Fluent::Plugin::KubernetesMetadataFilter] uses `#filter_stream` method.

2018-12-06 09:38:26 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not match with data '2018-12-06 09:37:48.383 [INFO][37] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}\n'" 
  location=nil 
  tag="kubernetes.var.log.containers.calico-node-lshc6_kube-system_calico-node-c7b419a505b0420809242ee07b0b7f4c71d579fcc50b87080178541cab96f027.log" 
  time=2018-12-06 09:37:48.383857690 +0000 
  record={"log"=>"2018-12-06 09:37:48.383 [INFO][37] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}\n", "stream"=>"stdout"}

...endlessly more error messages...

I'm trying to use a filter for parsing structured logging in combination with this kubernetes plugin. Part of my config:

<source>
  type tail
  enable_stat_watcher false
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag kubernetes.*
  format json
  read_from_head true
</source>

<filter kubernetes.var.log.containers.**.log>
  type kubernetes_metadata
</filter>

<filter kubernetes.var.log.containers.**.log>
  @type parser
  key_name log
  reserve_data true
  remove_key_name_field true
  inject_key_prefix log.
  <parse>
    @type json
  </parse>
</filter>

Ultimately, fluentd keeps crashing until I remove my parser filter.

richm (Contributor) commented Dec 6, 2018

What version of fluent-plugin-kubernetes_metadata_filter are you using? What version of fluentd? It looks like you want the parser plugin to do the JSON parsing of the log field, but maybe kubernetes_metadata has already done it? At any rate, the log field reported in the error is not a JSON value:

record={"log"=>"2018-12-06 09:37:48.383 [INFO][37] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}\n", "stream"=>"stdout"}

The fluent-plugin-kubernetes_metadata_filter has some logic in it so that it will only attempt to parse the log field as a JSON value if it begins with '{' and ends with '}' - perhaps the parser plugin needs the same logic?
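One way to approximate that from configuration alone (a sketch, assuming the built-in filter_parser in Fluentd v1) is to keep records that fail to parse instead of emitting them to the error stream:

<filter kubernetes.var.log.containers.**.log>
  @type parser
  key_name log
  # keep the original record when parsing fails
  reserve_data true
  remove_key_name_field true
  inject_key_prefix log.
  # don't dump non-JSON lines as error events
  emit_invalid_record_to_error false
  <parse>
    @type json
  </parse>
</filter>

The intent is that JSON log lines get log.* fields added while plain-text lines pass through untouched.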

@pappnase99

Thanks for the quick answer. This is happening with the fluent/fluentd-kubernetes-daemonset:v1.3.0-debian-cloudwatch image, which uses fluent-plugin-kubernetes_metadata_filter version 2.1.4.

From the error messages there seems to be some kind of recursion going on, like the same log entries being parsed over and over again. I still have no idea why this is happening. Chaining filters shouldn't be a problem from what I've read about fluentd. AFAIK the kubernetes_metadata_filter only adds some properties under "kubernetes"!? Why is it messing with the log message? Using kubernetes_metadata with structured log parsing should be a regular use case.

richm (Contributor) commented Dec 6, 2018

The fluent-plugin-kubernetes_metadata_filter should not be messing with the log messages. We removed that feature: https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter#configuration - "NOTE: As of the release 2.1.x of this plugin, it no longer supports parsing the source message into JSON and attaching it to the payload."

And - I don't see any evidence that this is happening, so that's good.

I don't think this message is really a problem:

2018-12-06 09:38:26 +0000 [info]: #0 disable filter chain optimization because [Fluent::Plugin::KubernetesMetadataFilter] uses #filter_stream method.

It just means that fluentd cannot apply some sort of optimization.

From the error messages there seems to be some kind of recursion going on, like the same log entries being parsed over and over again.

I don't see that from what you have posted, so I'm not sure what you mean.

My question is - what happens in the @type parser plugin when the log field is not a JSON valued string?

@pappnase99

I'm not sure how to test that when the services I want to monitor produce JSON logs.

richm (Contributor) commented Dec 6, 2018

record={"log"=>"2018-12-06 09:37:48.383 [INFO][37] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}\n", "stream"=>"stdout"}

This is a JSON log. However:

<filter kubernetes.var.log.containers.**.log>
  @type parser
  key_name log
  reserve_data true
  remove_key_name_field true
  inject_key_prefix log.
  <parse>
    @type json
  </parse>
</filter>

This means - parse the field log in the record as a JSON field. In your case, the field log is NOT a JSON field:

"log"=>"2018-12-06 09:37:48.383 [INFO][37] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}\n"

The value of the log field is NOT JSON valued: "2018-12-06 09:37:48.383 [INFO][37] health.go 150: Overall health summary=&health.HealthReport{Live:true, Ready:true}\n" is NOT a JSON string.
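For contrast, a log line the json parser could handle would have to be a JSON object itself, e.g. (hypothetical example):

"log"=>"{\"level\":\"info\",\"msg\":\"Overall health summary\",\"live\":true,\"ready\":true}\n"

Anything else will be reported as a ParserError like the one above.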

@pappnase99

OK, I see the point. The services I'm interested in do have structured logging, but when my filter is applied to everything running inside the Kubernetes cluster, there are going to be errors from any human-readable logs.

The proper approach would probably be to narrow down the filter pattern from

<filter kubernetes.var.log.containers.**.log>

to something more specific. The problem here is that since K8s pods are volatile, you cannot simply pin it to exact file names.
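Since the tag produced by in_tail encodes the file name (kubernetes.var.log.containers.<pod>_<namespace>_<container>-<id>.log), one workaround is to match on the namespace or container part instead of the exact pod name. A sketch, where my-namespace is a placeholder for a real namespace:

<filter kubernetes.var.log.containers.**_my-namespace_**.log>
  @type parser
  key_name log
  reserve_data true
  remove_key_name_field true
  inject_key_prefix log.
  <parse>
    @type json
  </parse>
</filter>

A similar glob pattern is often used to exclude fluentd's own container logs by container name.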

richm (Contributor) commented Dec 7, 2018

OK - I'm really not sure what's going on. For debugging, what I usually do is put code like this in to intercept what the record looks like at various points in the pipeline:

<filter tag.I.am.interested.**.in>
  @type stdout
</filter>

Then I can see what the record looks like before and after the filter I'm interested in. So, in your case, put a stdout filter before the kubernetes metadata filter and another one after it, and see if the filter is altering the record in a bad way.
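Concretely, something like this (a sketch using the tag pattern from your config; filters for the same tag run in the order they appear):

<filter kubernetes.var.log.containers.**.log>
  @type stdout
</filter>

<filter kubernetes.var.log.containers.**.log>
  @type kubernetes_metadata
</filter>

<filter kubernetes.var.log.containers.**.log>
  @type stdout
</filter>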

@pappnase99

Thanks, I will try that one out.

@pappnase99

As a follow-up: the problem I encountered didn't have anything to do with this plugin. My problem was that the fluentd-related events didn't get discarded into the @type null match as expected; that's why the log recursion happened, which crashed the fluentd service.

@marufmax

+1 having the same issue

@linbingdouzhe

same here

@jcantrill (Contributor)

Closing this issue per the previous comment. The disabled filter chain optimization should be fixed by #263.
