Error Encountered During Hot Reload in Fluent Bit 3.0.2 #8817

Open
im-jinxinwang opened this issue May 11, 2024 · 4 comments
Labels
status: waiting-for-triage; waiting-for-user (Waiting for more information, tests or requested changes)

Comments

@im-jinxinwang

Bug Report

Describe the bug

When attempting to perform a hot reload with Fluent Bit version 3.0.2, the following error is encountered:

[2024/05/11 16:19:17] [error] [multiline] parser 'exception_test' not registered
[2024/05/11 16:19:17] [error] [input:tail:tail.0] could not load multiline parsers
[2024/05/11 16:19:17] [error] failed initialize input tail.0
[2024/05/11 16:19:17] [error] [engine] input initialization failed
[2024/05/11 16:19:18] [error] [reload] loaded configuration contains error(s). Reloading is aborted

fluent-bit.conf

[SERVICE]
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_PORT    38085
    Hot_Reload   On
    Flush        1
    Grace        5
    Daemon       Off
    Log_Level    info
    Parsers_File parsers.conf
[INPUT]
    Name         tail
    Tag              foundation-audit
    DB  /fluent-bit/log/tail-containers-state.db
    Read_from_Head true
    Skip_Empty_Lines true
    Path  /tmp/test*
    Path_Key  pod_log_path
    multiline.parser exception_test
[OUTPUT]
    Name        kafka
    Match       *
    Brokers     192.168.123.50:9092
    Topics      testlog

parsers.conf

[MULTILINE_PARSER]
    name          exception_test
    type          regex
    flush_timeout 1000
    rule          "start_state"  "/(Dec \d+ \d+\:\d+\:\d+)(.*)/" "cont"
    rule          "cont" "/^\s+at.*/" "cont"

To Reproduce

  • Rubular link if applicable:

  • Example log message if applicable:

  • Steps to reproduce the problem:
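
The report does not say how the reload was triggered. As a minimal sketch, assuming the two mechanisms described in the Fluent Bit hot reload documentation (a SIGHUP signal, or a POST to the `/api/v2/reload` HTTP endpoint) and the `HTTP_PORT 38085` from the configuration above, a reload could be triggered with helpers like these. The function names are hypothetical and only for illustration.

```shell
#!/bin/sh
# Sketch only: helpers for triggering a Fluent Bit hot reload.
# Assumptions: Hot_Reload is On and HTTP_Server is On, as in the
# [SERVICE] section above. Function names are illustrative, not part
# of Fluent Bit.

# Build the reload endpoint URL from a host and a port.
reload_url() {
    printf 'http://%s:%s/api/v2/reload\n' "$1" "$2"
}

# Option 1: send SIGHUP to the running Fluent Bit process (PID as $1).
reload_by_signal() {
    kill -HUP "$1"
}

# Option 2: POST an empty JSON body to the reload endpoint.
reload_by_http() {
    curl -sS -X POST -d '{}' "$(reload_url "$1" "$2")"
}

# Query the reload counter exposed by the same endpoint.
reload_count() {
    curl -sS -X GET "$(reload_url "$1" "$2")"
}
```

For example, `reload_by_http 127.0.0.1 38085` followed by `reload_count 127.0.0.1 38085` would show whether the reload was accepted; if the reload is aborted, as in the log above, the counter would not be expected to increase.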

Expected behavior

Screenshots

[Two screenshots attached to the original issue; images not available]

Your Environment

  • Version used: 3.0.2
  • Configuration:
  • Environment name and version (e.g. Kubernetes? What version?):
  • Server type and version:
  • Operating System and version:
  • Filters and plugins:

Additional context

@lecaros
Contributor

lecaros commented May 11, 2024

Hi @im-jinxinwang,
is this reproducible for you when building from master?

I just tried it, and it works (apart from the Kafka output, which cannot connect, but there is no issue with hot reloading).

fluent-bit -c 8817.conf
Fluent Bit v3.0.4
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

___________.__                        __    __________.__  __          ________  
\_   _____/|  |  __ __   ____   _____/  |_  \______   \__|/  |_  ___  _\_____  \ 
 |    __)  |  | |  |  \_/ __ \ /    \   __\  |    |  _/  \   __\ \  \/ / _(__  < 
 |     \   |  |_|  |  /\  ___/|   |  \  |    |    |   \  ||  |    \   / /       \
 \___  /   |____/____/  \___  >___|  /__|    |______  /__||__|     \_/ /______  /
     \/                     \/     \/               \/                        \/ 

[2024/05/11 17:29:25] [ info] [fluent bit] version=3.0.4, commit=ce7aafab40, pid=39607
[2024/05/11 17:29:25] [ info] [storage] ver=1.1.6, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/05/11 17:29:25] [ info] [cmetrics] version=0.9.0
[2024/05/11 17:29:25] [ info] [ctraces ] version=0.5.1
[2024/05/11 17:29:25] [ info] [input:tail:tail.0] initializing
[2024/05/11 17:29:25] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2024/05/11 17:29:25] [ info] [input:tail:tail.0] multiline core started
[2024/05/11 17:29:25] [ info] [output:kafka:kafka.0] brokers='192.168.123.50:9092' topics='testlog'
[2024/05/11 17:29:25] [ info] [http_server] listen iface=0.0.0.0 tcp_port=38085
[2024/05/11 17:29:25] [ info] [sp] stream processor started
[2024/05/11 17:29:32] [engine] caught signal (SIGHUP)
[2024/05/11 17:29:32] [ info] reloading instance pid=39607 tid=0x2053d3ac0
[2024/05/11 17:29:32] [ info] [reload] stop everything of the old context
[2024/05/11 17:29:32] [ info] [input] pausing tail.0
[2024/05/11 17:29:33] [ info] [reload] start everything
[2024/05/11 17:29:33] [ info] [fluent bit] version=3.0.4, commit=ce7aafab40, pid=39607
[2024/05/11 17:29:33] [ info] [storage] ver=1.1.6, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/05/11 17:29:33] [ info] [cmetrics] version=0.9.0
[2024/05/11 17:29:33] [ info] [ctraces ] version=0.5.1
[2024/05/11 17:29:33] [ info] [input:tail:tail.0] initializing
[2024/05/11 17:29:33] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2024/05/11 17:29:33] [ info] [input:tail:tail.0] multiline core started
[2024/05/11 17:29:33] [ info] [output:kafka:kafka.0] brokers='192.168.123.50:9092' topics='testlog'
[2024/05/11 17:29:33] [ info] [http_server] listen iface=0.0.0.0 tcp_port=38085
[2024/05/11 17:29:33] [ info] [sp] stream processor started
[2024/05/11 17:30:03] [ warn] [output:kafka:kafka.0] fluent-bit#producer-2: [thrd:192.168.123.50:9092/bootstrap]: 192.168.123.50:9092/bootstrap: Connection setup timed out in state CONNECT (after 30026ms in state CONNECT)
[2024/05/11 17:30:03] [error] [output:kafka:kafka.0] fluent-bit#producer-2: [thrd:192.168.123.50:9092/bootstrap]: 1/1 brokers are down
[2024/05/11 17:30:34] [ warn] [output:kafka:kafka.0] fluent-bit#producer-2: [thrd:192.168.123.50:9092/bootstrap]: 192.168.123.50:9092/bootstrap: Connection setup timed out in state CONNECT (after 30025ms in state CONNECT, 1 identical error(s) suppressed)

@lecaros lecaros added the waiting-for-user Waiting for more information, tests or requested changes label May 11, 2024
@im-jinxinwang
Author

Hi @lecaros
I see that the version you ran is different from mine. I have the same problem in both 3.0.2 and 3.0.3: hot reloading fails.

@im-jinxinwang
Author

I compiled the latest code, and 3.0.4 does not have this problem, but 3.0.4 has not been released yet.

@hatmen

hatmen commented May 25, 2024

Hi @lecaros
When using the Kafka output with hot reloading, I encountered the same problem and verified that 2.2.2, 3.0.3, 3.0.4, and 3.0.5 all have similar issues.

fluent-bit.conf

[SERVICE]
    flush 10
    daemon Off
    log_level trace
    Hot_Reload   On
    HTTP_Listen  0.0.0.0
    HTTP_PORT    2020
    HTTP_Server  On
    parsers_file /etc/fluent-bit/parsers.conf
    plugins_file /etc/fluent-bit/plugins.conf

[INPUT]
        Name random
        Tag test4
        Samples 10

[FILTER]
        Name modify
        Match test4
        Add flb_topic flb-baron-test

[OUTPUT]
    Name        kafka
    Match       *
    Brokers     192.168.7.10:9092,192.168.7.11:9092
    Topics      fluent-logs
    topic_key   flb_topic
    Dynamic_Topic on
  

Describe the bug

[2024/05/25 03:31:25] [ info] [sp] stream processor started
[2024/05/25 03:31:26] [trace] [filter:modify:modify.0 at /tmp/fluent-bit/plugins/filter_modify/modify.c:1426] Input map size 1 elements, output map size 2 elements
[2024/05/25 03:31:26] [debug] [input chunk] update output instances with new chunk size diff=67, records=1, input=random.0
[2024/05/25 03:31:26] [engine] caught signal (SIGHUP)
[2024/05/25 03:31:26] [ info] reloading instance pid=3163754 tid=0x7f4391275340
[2024/05/25 03:31:26] [ info] [reload] stop everything of the old context
[2024/05/25 03:31:26] [trace] [engine] flush enqueued data
[2024/05/25 03:31:26] [trace] [task 0x7f438f836c80] created (id=0)
[2024/05/25 03:31:26] [debug] [task] created task=0x7f438f836c80 id=0 OK
[2024/05/25 03:31:26] [ warn] [engine] service will shutdown when all remaining tasks are flushed
[2024/05/25 03:31:26] [ info] [input] pausing random.0
{"[2024/05/25 03:31:26] [debug] in produce_message

rand_value"=>10952961381925857816, "flb_topic"=>"flb-baron-test"}[2024/05/25 03:31:26] [ info] [out_kafka] new topic added: flb-baron-test
[2024/05/25 03:31:26] [debug] [output:kafka:kafka.0] enqueued message (95 bytes) for topic 'flb-baron-test'
[2024/05/25 03:31:26] [trace] [engine] [task event] task_id=0 out_id=0 return=OK
[2024/05/25 03:31:26] [debug] [out flush] cb_destroy coro_id=0
[2024/05/25 03:31:26] [trace] [coro] destroy coroutine=0x7f438f80bc38 data=0x7f438f80bc50
[2024/05/25 03:31:26] [debug] [task] destroy task=0x7f438f836c80 (task_id=0)
[2024/05/25 03:31:27] [ info] [engine] service has stopped (0 pending tasks)
[2024/05/25 03:31:27] [ info] [input] pausing random.0
[2024/05/25 03:31:27] [debug] [output:kafka:kafka.0] message delivered (95 bytes, partition 0)
[2024/05/25 03:31:27] [engine] caught signal (SIGSEGV)
#0  0x4913b6            in  ???() at ???:0
#1  0x526459            in  ???() at ???:0
#2  0x527732            in  ???() at ???:0
#3  0x513cbb            in  ???() at ???:0
#4  0x513d3d            in  ???() at ???:0
#5  0x54d52a            in  ???() at ???:0
#6  0x54f2ef            in  ???() at ???:0
#7  0x54f403            in  ???() at ???:0
#8  0x57763d            in  ???() at ???:0
#9  0x577270            in  ???() at ???:0
#10 0x51866b            in  ???() at ???:0
#11 0x7f439109f7f1      in  ???() at ???:0
#12 0x7f439103f44f      in  ???() at ???:0
#13 0xffffffffffffffff  in  ???() at ???:0
Aborted (core dumped)
