Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-p overwrites the default plugin path #4269

Open
Bonko opened this issue Aug 16, 2023 · 8 comments
Open

-p overwrites the default plugin path #4269

Bonko opened this issue Aug 16, 2023 · 8 comments
Assignees
Labels

Comments

@Bonko
Copy link

Bonko commented Aug 16, 2023

Describe the bug

When updating from fluent/fluentd-kubernetes-daemonset:v1.14.6-debian-logzio-amd64-1.0 to fluent/fluentd-kubernetes-daemonset:v1.16.2-debian-logzio-amd64-1.0 our setup stopped working.

During investigation it turned out that plugins in the default plugin directory (/etc/fluent/plugin/) are not processed anymore when using the -p parameter.

To Reproduce

Add custom pluins into /etc/fluent/plugin/ directory and ensure that they are used by the config:

root@27cf7d7bb6eb:/home/fluent# ls -1  /etc/fluent/plugin/
filter_empty_field.rb
filter_field_size.rb
filter_field_type.rb
filter_message.rb

When starting fluentd with the default options from the Dockerfile command (fluentd -c /test/fluentd.conf -p /fluentd/plugins/ --gemfile /fluentd/Gemfile) I am getting this error:

2023-08-16 10:20:01 +0000 [error]: config error file="/test/fluentd.conf" error_class=Fluent::NotFoundPluginError error="Unknown filter plugin 'message'. Run 'gem search -rd fluent-plugin' to find plugins"

I can get rid of this error by either not specifying -p at all:

fluentd -c /test/fluentd.conf  --gemfile /fluentd/Gemfile

or by adding the default plugin explicitly:

fluentd -c /test/fluentd.conf -p /fluentd/plugins/ -p /etc/fluent/plugin  --gemfile /fluentd/Gemfile

Expected behavior

The -p parameter should add plugin directories as documented without unsetting the default directory.

Your Environment

- Fluentd version: `1.16.2`
- Operating system: `Debian 11`
- Kernel version: `5.15.49-linuxkit-pr`

Your Configuration

<label @FLUENT_LOG>
  <match fluent.**>
    @type null
  </match>
</label>

<source>
  @type tail
  @id in_tail_container_logs
  path /var/log/containers/*.log
  tag "kubernetes.*"
  pos_file /var/log/containers/fluentd-containers.log.pos
  read_from_head true
  <parse>
    @type "json"
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>

@include /fluentd/etc/conf.d/*.conf
[..]

/fluentd/etc/conf.d/01-pre-parsing-filters.conf:

<filter kubernetes.**>
  @type message
  message_fields log, MESSAGE
</filter>


### Your Error Log

```shell
2023-08-16 10:27:01 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
2023-08-16 10:27:01 +0000 [warn]: '@' is the system reserved prefix. It works in the nested configuration for now but it will be rejected: @timestamp
2023-08-16 10:27:01 +0000 [info]: parsing config file is succeeded path="/test/fluentd.conf"
2023-08-16 10:27:02 +0000 [info]: gem 'fluentd' version '1.16.2'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-concat' version '2.5.0'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-detect-exceptions' version '0.0.15'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-grok-parser' version '2.6.2'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-json-in-json-2' version '1.0.2'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '3.2.0'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-logzio' version '0.2.1'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-multi-format-parser' version '1.0.0'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-parser-cri' version '0.1.1'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-prometheus' version '2.1.0'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.1.1'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2023-08-16 10:27:02 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2023-08-16 10:27:02 +0000 [error]: config error file="/test/fluentd.conf" error_class=Fluent::NotFoundPluginError error="Unknown filter plugin 'message'. Run 'gem search -rd fluent-plugin' to find plugins"


### Additional context

_No response_
@kenhys
Copy link
Contributor

kenhys commented Sep 4, 2023

Thank you for feedback.

Currently, it seems that actually overrides default plugin directory (reset) and it seems intended behavior.

https://github.com/fluent/fluentd/blob/master/lib/fluent/supervisor.rb#L512

As you mentioned, specify both of your specific directory and default plugin directory works.

@Bonko
Copy link
Author

Bonko commented Sep 5, 2023

Thanks for your reply @kenhys

While I get that this is intended behaviour now, the problem is that this is a behavioural change that was not documented. I think it would have make sense to highlight this as a breaking change in the release notes.
The documentation/help section also still states that -p adds a directory, omitting that this overwrites the default path:

-p, --plugin DIR                 add plugin directory

@kenhys
Copy link
Contributor

kenhys commented Sep 5, 2023

Hmm, breaking change is not desirable, it seems that expecting not overriding default path is reasonable. 🤔

@kenhys
Copy link
Contributor

kenhys commented Sep 5, 2023

with further investigation, it seems that regression since v1.16.0.

Fix problem that some system configs are not reflected #4064 #4065 #4086 #4090 #4096

It maybe side effect of above fix.

At least v1.15.3 seems work. (it does not reset default plugin path)

@kenhys kenhys added the bug label Sep 5, 2023
@daipom
Copy link
Contributor

daipom commented Sep 5, 2023

I'm sorry.
This change breaks the specification.

https://github.com/fluent/fluentd/pull/4064/files#diff-fe3c9c8a2fe3872b84e0de1d14ecd669258b0e57609c470b601bdfabedb91224R49

This should be fixed.

@daipom daipom self-assigned this Sep 5, 2023
@mrudrego
Copy link
Contributor

Hi @daipom , i was looking at this issue to see what could be the fix.

There are 2 hash that is used, default_opts is used for storing the default configuration, while cmd_opts is used to store user provided command line options. Later these 2 hashes are merged using the merge function . As a result of this merge function, the plugin_dirs key value available in default_opts is overwritten by the value in cmd_opts. If the old behaviour has to be continued, then values of the key(plugin_dirs) in the 2 hashes should have been concatenated instead of overwriting.

The fix which i could think of is using concat function to join the content of plugin_dirs key array values after the 2 hashes are merged at line
https://github.com/fluent/fluentd/blob/master/lib/fluent/command/fluentd.rb#L251C29-L251C29

opts[:plugin_dirs].concat(default_opts[:plugin_dirs])

Please let me know if this looks ok or is there a better option.

Thanks,

@mrudrego
Copy link
Contributor

@daipom, can you please let me know if the approach looks ok or is there any other suggestion for the fix. Based on this i can raise a PR.

Thanks,

@daipom
Copy link
Contributor

daipom commented Oct 30, 2023

@mrudrego Thanks! Sorry for my late response.
Currently, I'm busy on checking the issues related to #3614, so my response may be slow for a while. Sorry.

The fix for this issue would be difficult.
The implementation around this is very complicated. 😢
It is mainly because we have 3 different option values: default option; command line option; system config.
In addition, some options are used in the command script fluentd.rb, and some options are used in the Fluentd processes supervisor.rb.

Actually, it's not opts that is passed to Supervisor, but cmd_opts.

supervisor = Fluent::Supervisor.new(cmd_opts)
supervisor.configure(supervisor: true)
supervisor.run_supervisor(dry_run: opts[:dry_run])
else
if opts[:standalone_worker] && opts[:workers] && opts[:workers] > 1
puts "Error: multi workers is not supported with --no-supervisor"
exit 2
end
worker = Fluent::Supervisor.new(cmd_opts)
worker.configure

It is merged here. 😢

def initialize(cl_opt)
@cl_opt = cl_opt
opt = self.class.default_options.merge(cl_opt)
@config_file_type = opt[:config_file_type]
@daemonize = opt[:daemonize]
@standalone_worker= opt[:standalone_worker]
@config_path = opt[:config_path]
@inline_config = opt[:inline_config]
@use_v1_config = opt[:use_v1_config]
@conf_encoding = opt[:conf_encoding]
@log_path = opt[:log_path]
@show_plugin_config = opt[:show_plugin_config]
@libs = opt[:libs]
@plugin_dirs = opt[:plugin_dirs]
@chgroup = opt[:chgroup]
@chuser = opt[:chuser]
@chumask = opt[:chumask]
@signame = opt[:signame]
# TODO: `@log_rotate_age` and `@log_rotate_size` should be removed
# since it should be merged with SystemConfig in `build_system_config()`.
# We should always use `system_config.log.rotate_age` and `system_config.log.rotate_size`.
# However, currently, there is a bug that `system_config.log` parameters
# are not in `Fluent::SystemConfig::SYSTEM_CONFIG_PARAMETERS`, and these
# parameters are not merged in `build_system_config()`.
# Until we fix the bug of `Fluent::SystemConfig`, we need to use these instance variables.
@log_rotate_age = opt[:log_rotate_age]
@log_rotate_size = opt[:log_rotate_size]
@finished = false
end

So, I think this fix is very difficult. It would require knowledge of the entire Fluentd startup logic.

I don't think this fix will be done in time for the next release v1.16.3.
We are planning to release v1.16.3 very soon for fixing #3614.
After v1.16.3 is released, I will think about how we can fix this.

Of course, if you have any good ideas, please let us know! Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: To-Do
Development

No branches or pull requests

4 participants