Describe the bug
When an output plugin is configured with "Retry_Limit False", fluent-bit keeps retrying to flush its chunks seemingly forever, even after a reload has been requested.
To Reproduce
Configure fluent-bit with a working OTEL input plugin and a misconfigured OTEL output plugin, so that the output cannot flush its chunks.
Generate some OTEL data for fluent-bit.
Correct the configuration values, then issue a reload via SIGHUP or the HTTP API.
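The reload in the last step can be triggered from a script. The following is a minimal sketch, assuming the PID of the running fluent-bit process is already known (the log in this report shows pid=19707) and that Hot_Reload is enabled as in the configuration given in this report. The helper name `request_reload` is hypothetical and not part of fluent-bit. The HTTP API variant is not sketched here because the reported configuration has HTTP_Server Off.

```python
import os
import signal


def request_reload(pid: int) -> None:
    """Send SIGHUP to the process with the given PID.

    Hypothetical helper: fluent-bit with Hot_Reload On is expected to
    treat SIGHUP as a reload request, as the log in this report shows
    ("caught signal (SIGHUP)").
    """
    os.kill(pid, signal.SIGHUP)
```

Calling `request_reload(19707)` would correspond to the SIGHUP seen at the top of the log; the PID has to be replaced with that of the actual process.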
The log below shows what happens after fluent-bit has received input data and a reload is issued.
[2024/05/03 11:02:17] [engine] caught signal (SIGHUP)
[2024/05/03 11:02:17] [ info] reloading instance pid=19707 tid=0x7fa4aef010
[2024/05/03 11:02:17] [ info] [reload] stop everything of the old context
[2024/05/03 11:02:17] [ warn] [engine] service will shutdown when all remaining tasks are flushed
[2024/05/03 11:02:17] [debug] [engine] re-scheduled retry=0x7f9c066090 for task 0
[2024/05/03 11:02:17] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:17] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:17] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:17] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:17] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:17] [ info] [task] opentelemetry/opentelemetry.0 has 1 pending task(s):
[2024/05/03 11:02:17] [ info] [task] task_id=0 still running on route(s): opentelemetry/opentelemetry.0
[2024/05/03 11:02:17] [ info] [task] storage_backlog/storage_backlog.1 has 0 pending task(s):
[2024/05/03 11:02:17] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:17] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:17] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:17] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:17] [debug] [out flush] cb_destroy coro_id=3
[2024/05/03 11:02:17] [debug] [retry] re-using retry for task_id=0 attempts=4
[2024/05/03 11:02:17] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:18] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:18] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:18] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:18] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:18] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:18] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:18] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:18] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:18] [debug] [out flush] cb_destroy coro_id=4
[2024/05/03 11:02:18] [debug] [retry] re-using retry for task_id=0 attempts=5
[2024/05/03 11:02:18] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:19] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:19] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:19] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:19] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:19] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:19] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:19] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:19] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:19] [debug] [out flush] cb_destroy coro_id=5
[2024/05/03 11:02:19] [debug] [retry] re-using retry for task_id=0 attempts=6
[2024/05/03 11:02:19] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:20] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:20] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:20] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:20] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:20] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:20] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:20] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:20] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:20] [debug] [out flush] cb_destroy coro_id=6
[2024/05/03 11:02:20] [debug] [retry] re-using retry for task_id=0 attempts=7
[2024/05/03 11:02:20] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:21] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:21] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:21] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:21] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:21] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:21] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:21] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:21] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:21] [debug] [out flush] cb_destroy coro_id=7
[2024/05/03 11:02:21] [debug] [retry] re-using retry for task_id=0 attempts=8
[2024/05/03 11:02:21] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:22] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:22] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:22] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:22] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:22] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:22] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:22] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:22] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:22] [debug] [out flush] cb_destroy coro_id=8
[2024/05/03 11:02:22] [debug] [retry] re-using retry for task_id=0 attempts=9
[2024/05/03 11:02:22] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:23] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:23] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:23] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:23] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:23] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:23] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:23] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:23] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:23] [debug] [out flush] cb_destroy coro_id=9
[2024/05/03 11:02:23] [debug] [retry] re-using retry for task_id=0 attempts=10
[2024/05/03 11:02:23] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:24] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:24] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:24] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:24] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:24] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:24] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:24] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:24] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:24] [debug] [out flush] cb_destroy coro_id=10
[2024/05/03 11:02:24] [debug] [retry] re-using retry for task_id=0 attempts=11
[2024/05/03 11:02:24] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:25] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:25] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:25] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:25] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:25] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:25] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:25] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:25] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:25] [debug] [out flush] cb_destroy coro_id=11
[2024/05/03 11:02:25] [debug] [retry] re-using retry for task_id=0 attempts=12
[2024/05/03 11:02:25] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:26] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:26] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:26] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:26] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:26] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:26] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:26] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:26] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:26] [debug] [out flush] cb_destroy coro_id=12
[2024/05/03 11:02:26] [debug] [retry] re-using retry for task_id=0 attempts=13
[2024/05/03 11:02:26] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:27] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:27] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:27] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:27] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:27] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:27] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:27] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:27] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:27] [debug] [out flush] cb_destroy coro_id=13
[2024/05/03 11:02:27] [debug] [retry] re-using retry for task_id=0 attempts=14
[2024/05/03 11:02:27] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:28] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:28] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:28] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:28] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:28] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:28] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:28] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:28] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:28] [debug] [out flush] cb_destroy coro_id=14
[2024/05/03 11:02:28] [debug] [retry] re-using retry for task_id=0 attempts=15
[2024/05/03 11:02:28] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:29] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:29] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:29] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:29] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:29] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:29] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:29] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:29] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:29] [debug] [out flush] cb_destroy coro_id=15
[2024/05/03 11:02:29] [debug] [retry] re-using retry for task_id=0 attempts=16
[2024/05/03 11:02:29] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:30] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:30] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:30] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:30] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:30] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:30] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:30] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:30] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:30] [debug] [out flush] cb_destroy coro_id=16
[2024/05/03 11:02:30] [debug] [retry] re-using retry for task_id=0 attempts=17
[2024/05/03 11:02:30] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:31] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:31] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:31] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:31] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:31] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:31] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:31] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:31] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:31] [debug] [out flush] cb_destroy coro_id=17
[2024/05/03 11:02:31] [debug] [retry] re-using retry for task_id=0 attempts=18
[2024/05/03 11:02:31] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:32] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:32] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:32] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:32] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:32] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:32] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:32] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:32] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:32] [debug] [out flush] cb_destroy coro_id=18
[2024/05/03 11:02:32] [debug] [retry] re-using retry for task_id=0 attempts=19
[2024/05/03 11:02:32] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:33] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:33] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:33] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:33] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:33] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:33] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:33] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:33] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:33] [debug] [out flush] cb_destroy coro_id=19
[2024/05/03 11:02:33] [debug] [retry] re-using retry for task_id=0 attempts=20
[2024/05/03 11:02:33] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:34] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:34] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:34] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:34] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:34] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:34] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:34] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:34] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:34] [debug] [out flush] cb_destroy coro_id=20
[2024/05/03 11:02:34] [debug] [retry] re-using retry for task_id=0 attempts=21
[2024/05/03 11:02:34] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:35] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:35] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:35] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:35] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:35] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:35] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:35] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:35] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:35] [debug] [out flush] cb_destroy coro_id=21
[2024/05/03 11:02:35] [debug] [retry] re-using retry for task_id=0 attempts=22
[2024/05/03 11:02:35] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:36] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:36] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:36] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:36] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:36] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:36] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:36] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:36] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:36] [debug] [out flush] cb_destroy coro_id=22
[2024/05/03 11:02:36] [debug] [retry] re-using retry for task_id=0 attempts=23
[2024/05/03 11:02:36] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:37] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:37] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:37] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:37] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:37] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:37] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:37] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:37] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:37] [debug] [out flush] cb_destroy coro_id=23
[2024/05/03 11:02:37] [debug] [retry] re-using retry for task_id=0 attempts=24
[2024/05/03 11:02:37] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
[2024/05/03 11:02:38] [ info] [input] pausing storage_backlog.1
[2024/05/03 11:02:38] [debug] [output:opentelemetry:opentelemetry.0] ctraces msgpack size: 602
[2024/05/03 11:02:38] [debug] [output:opentelemetry:opentelemetry.0] final payload size: 286
[2024/05/03 11:02:38] [debug] [upstream] KA connection #29 to xxx.xxx:443 has been assigned (recycled)
[2024/05/03 11:02:38] [debug] [http_client] not using http_proxy for header
[2024/05/03 11:02:38] [error] [output:opentelemetry:opentelemetry.0] xxx.xxx:443, HTTP status=401
[2024/05/03 11:02:38] [debug] [upstream] KA connection #29 to xxx.xxx:443 is now available
[2024/05/03 11:02:38] [debug] [output:opentelemetry:opentelemetry.0] http_post result FLB_RETRY
[2024/05/03 11:02:38] [debug] [out flush] cb_destroy coro_id=24
[2024/05/03 11:02:38] [debug] [retry] re-using retry for task_id=0 attempts=25
[2024/05/03 11:02:38] [ warn] [engine] failed to flush chunk '19707-1714734107.667106355.flb', retry in 1 seconds: task_id=0, input=opentelemetry.0 > output=opentelemetry.0 (out_id=0)
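To make the retry pattern in the log easier to quantify, here is a small sketch that extracts the `re-using retry` lines and reports the first and last attempt counter together with the elapsed time. The regular expression is written against the log format shown above and is an assumption about that format only; the function name `summarize_retries` is hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2024/05/03 11:02:17] [debug] [retry] re-using retry for task_id=0 attempts=4
RETRY_RE = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\] \[debug\] \[retry\] "
    r"re-using retry for task_id=(?P<task>\d+) attempts=(?P<attempts>\d+)"
)


def summarize_retries(lines):
    """Return (first_attempt, last_attempt, elapsed_seconds), or None
    if no retry line is found."""
    hits = []
    for line in lines:
        m = RETRY_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y/%m/%d %H:%M:%S")
            hits.append((ts, int(m.group("attempts"))))
    if not hits:
        return None
    (t0, a0), (t1, a1) = hits[0], hits[-1]
    return a0, a1, (t1 - t0).total_seconds()


# First and last retry lines copied from the log above.
sample = [
    "[2024/05/03 11:02:17] [debug] [retry] re-using retry for task_id=0 attempts=4",
    "[2024/05/03 11:02:38] [debug] [retry] re-using retry for task_id=0 attempts=25",
]
print(summarize_retries(sample))  # (4, 25, 21.0)
```

For the excerpt above this gives 21 further attempts in 21 seconds, that is, one retry per second for as long as the output keeps returning HTTP status 401, with the old context still not stopped.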
Expected behavior
fluent-bit should interrupt output plugins that are still flushing, so that the configuration can be reloaded when it was misconfigured or needs to be updated.
Screenshots
Your Environment
Version used: 3.0.3
Configuration:
[SERVICE]
    HTTP_Server Off
    Hot_Reload On
    Log_Level debug
    flush 1
    storage.path /var/fluent_bit
    storage.sync normal
    storage.checksum off
    storage.max_chunks_up 32
    storage.backlog.mem_limit 5M
    storage.delete_irrecoverable_chunks off
[FILTER]
    Name throttle
    Match *
    Rate 1
    Window 3
    Interval 3s
[INPUT]
    name opentelemetry
    storage.type filesystem
    listen 127.0.0.1
    port 4318
    raw_traces false
    successful_response_code 200
    storage.type filesystem
    # storage.pause_on_chunks_overlimit on
    Mem_Buf_Limit 1M
[OUTPUT]
    name opentelemetry
    match *
    storage.total_limit_size 64M
    host xxx.xxx
    port 443
    Metrics_uri /opentelemetry/v1/metrics
    Logs_uri /opentelemetry/v1/logs
    Traces_uri /opentelemetry/v1/traces
    Log_response_payload False
    Retry_Limit False
    tls True
    tls.ca_file /etc/ssl/host.pem
    tls.crt_file /etc/ssl/cert.pem
    tls.key_file /etc/ssl/private/priv.key
Environment name and version (e.g. Kubernetes? What version?): Native on operating system
Server type and version: Specialized ARM hardware running fluent-bit along with other daemons.
Operating System and version: ptxdist Linux 5.4
Filters and plugins: see config above
Additional context
I chose Retry_Limit False because I need to keep OTEL data for a potentially long time, maybe 3-4 weeks. Filesystem buffering helps with this, but without Retry_Limit False, telemetry data could be deleted merely because the connection is unstable or absent. The vessel carrying the ARM device could be without an ISP connection for a "long" time.