out_loki: allow sending unquoted strings #8814

Open
wants to merge 1 commit into
base: master

@iandrewt iandrewt commented May 10, 2024

This patch adds a third value, raw, to drop_single_key. It allows sending unquoted strings to Loki when using JSON as the line_format.

Strictly speaking, the quotes are required for the output to be valid JSON. However, Loki's JSON parser does not support reading a plain quoted string and complains that it cannot find a } character. The workaround is to combine regexp and line_format expressions to unquote the log before running any other parsers over it.

Adding raw as a third value, instead of changing what the existing values do, keeps backwards compatibility for anyone already relying on the existing behaviour.
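As a rough illustration of the three settings, here is a minimal Python model of the line that would be sent for the dummy record used in the test below. It is a sketch of the behaviour described in this PR, not the plugin's C code. The function name `format_line` and the handling of non-string values under raw are assumptions made for the example.

```python
import json


def format_line(record, drop_single_key="off"):
    """Illustrative model of the Loki log line with line_format json.

    Hypothetical helper: it mirrors the behaviour described in this PR,
    not the actual out_loki implementation.
    """
    if drop_single_key in ("on", "raw") and len(record) == 1:
        value = next(iter(record.values()))
        if drop_single_key == "raw" and isinstance(value, str):
            # raw: send the string as-is, without JSON quotes
            return value
        # on: the single value is still JSON-encoded, so strings keep
        # their quotes (assumption: non-string values under raw also
        # fall through to this branch)
        return json.dumps(value)
    # off, or more than one key left: send the whole record as JSON
    return json.dumps(record, separators=(",", ":"))


record = {"key": "value"}
lines = {mode: format_line(record, mode) for mode in ("off", "on", "raw")}
for mode, line in lines.items():
    print(f"{mode:>3}: {line}")
```

Only the raw setting produces a line that Loki's other parsers can read without an unquoting step first.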

An example query before this change for plaintext logs, to remove the quotes:

{job="fluent-bit"} | regexp `^"?(?P<log>.*?)"?$` | line_format "{{.log}}"

This step has to come before any other parsing of non-JSON logs. Loki's logfmt parser, for example, will otherwise drop the first and last key/value pairs because of the extraneous quotes. Given that Loki by design does not care about the input format (there's a reason JSON parsing is optional in the first place!), I do think this should be the default behaviour some day. Because that would be a breaking change, having it as an option for now should be fine.
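The unquoting step of the query above can be tried outside Loki. The sketch below mirrors the regexp stage with Python's `re` module, which accepts the same `(?P<log>...)` named-group syntax as the Go regular expressions Loki uses. The helper name `unquote` and the sample log lines are made up for the example.

```python
import re

# Same pattern as the regexp stage: optional leading quote, lazy capture
# into the named group "log", optional trailing quote.
UNQUOTE = re.compile(r'^"?(?P<log>.*?)"?$')


def unquote(line):
    """Return the line without one surrounding pair of double quotes."""
    match = UNQUOTE.match(line)
    # Fall back to the input if the pattern does not match (for example,
    # a line containing embedded newlines).
    return match.group("log") if match else line


quoted = '"level=info msg=started"'   # hypothetical line as stored in Loki
plain = 'level=info msg=started'      # the same line without quotes

print(unquote(quoted))
print(unquote(plain))
```

Note that the pattern only strips the outer quotes. Any escape sequences inside the string, such as `\"`, are left untouched.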

Addresses #4353, #3005


Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
[SERVICE]
    flush     1
    log_level trace

[INPUT]
    name      dummy
    dummy     {"key": "value"}

[OUTPUT]
    name                   loki
    match                  *
    host                   127.0.0.1
    port                   3100
    drop_single_key        raw
  • Debug log output from testing the change
    Note that for testing I sent this to a fake HTTP server instead of a Loki instance, hence the 200 response. The HTTP body my server received was this:
{"streams":[{"stream":{"job":"fluent-bit"},"values":[["1715344552991602966","value"]]}]}
Fluent Bit v3.0.4
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

___________.__                        __    __________.__  __          ________  
\_   _____/|  |  __ __   ____   _____/  |_  \______   \__|/  |_  ___  _\_____  \ 
 |    __)  |  | |  |  \_/ __ \ /    \   __\  |    |  _/  \   __\ \  \/ / _(__  < 
 |     \   |  |_|  |  /\  ___/|   |  \  |    |    |   \  ||  |    \   / /       \
 \___  /   |____/____/  \___  >___|  /__|    |______  /__||__|     \_/ /______  /
     \/                     \/     \/               \/                        \/ 

[2024/05/10 22:35:52] [ info] Configuration:
[2024/05/10 22:35:52] [ info]  flush time     | 1.000000 seconds
[2024/05/10 22:35:52] [ info]  grace          | 5 seconds
[2024/05/10 22:35:52] [ info]  daemon         | 0
[2024/05/10 22:35:52] [ info] ___________
[2024/05/10 22:35:52] [ info]  inputs:
[2024/05/10 22:35:52] [ info]      dummy
[2024/05/10 22:35:52] [ info] ___________
[2024/05/10 22:35:52] [ info]  filters:
[2024/05/10 22:35:52] [ info] ___________
[2024/05/10 22:35:52] [ info]  outputs:
[2024/05/10 22:35:52] [ info]      loki.0
[2024/05/10 22:35:52] [ info] ___________
[2024/05/10 22:35:52] [ info]  collectors:
[2024/05/10 22:35:52] [ info] [fluent bit] version=3.0.4, commit=41ef155add, pid=140071
[2024/05/10 22:35:52] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2024/05/10 22:35:52] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/05/10 22:35:52] [ info] [cmetrics] version=0.9.0
[2024/05/10 22:35:52] [ info] [ctraces ] version=0.5.1
[2024/05/10 22:35:52] [ info] [input:dummy:dummy.0] initializing
[2024/05/10 22:35:52] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2024/05/10 22:35:52] [debug] [dummy:dummy.0] created event channels: read=21 write=22
[2024/05/10 22:35:52] [debug] [loki:loki.0] created event channels: read=23 write=24
[2024/05/10 22:35:52] [ info] [output:loki:loki.0] configured, hostname=127.0.0.1:3100
[2024/05/10 22:35:52] [ info] [sp] stream processor started
[2024/05/10 22:35:52] [trace] [input chunk] update output instances with new chunk size diff=32, records=1, input=dummy.0
[2024/05/10 22:35:53] [trace] [task 0x7dae40019030] created (id=0)
[2024/05/10 22:35:53] [debug] [task] created task=0x7dae40019030 id=0 OK
[2024/05/10 22:35:53] [trace] [upstream] get new connection for 127.0.0.1:3100, net setup:
net.connect_timeout        = 10 seconds
net.source_address         = any
net.keepalive              = enabled
net.keepalive_idle_timeout = 30 seconds
net.max_worker_connections = 0
[2024/05/10 22:35:53] [trace] [net] connection #31 in process to 127.0.0.1:3100
[2024/05/10 22:35:53] [trace] [engine] resuming coroutine=0x7dae40019150
[2024/05/10 22:35:53] [trace] [io] connection OK
[2024/05/10 22:35:53] [debug] [upstream] KA connection #31 to 127.0.0.1:3100 is connected
[2024/05/10 22:35:53] [debug] [http_client] not using http_proxy for header
[2024/05/10 22:35:53] [trace] [io coro=0x7dae40019150] [net_write] trying 157 bytes
[2024/05/10 22:35:53] [trace] [io coro=0x7dae40019150] [fd 31] write_async(2)=157 (157/157)
[2024/05/10 22:35:53] [trace] [io coro=0x7dae40019150] [net_write] ret=157 total=157/157
[2024/05/10 22:35:53] [trace] [io coro=0x7dae40019150] [net_write] trying 88 bytes
[2024/05/10 22:35:53] [trace] [io coro=0x7dae40019150] [fd 31] write_async(2)=88 (88/88)
[2024/05/10 22:35:53] [trace] [io coro=0x7dae40019150] [net_write] ret=88 total=88/88
[2024/05/10 22:35:53] [trace] [io coro=0x7dae40019150] [net_read] try up to 4095 bytes
[2024/05/10 22:35:53] [trace] [engine] resuming coroutine=0x7dae40019150
[2024/05/10 22:35:53] [trace] [io coro=0x7dae40019150] [net_read] ret=124
[2024/05/10 22:35:53] [debug] [output:loki:loki.0] 127.0.0.1:3100, HTTP status=200
accepted
[2024/05/10 22:35:53] [debug] [upstream] KA connection #31 to 127.0.0.1:3100 is now available
[2024/05/10 22:35:53] [trace] [engine] [task event] task_id=0 out_id=0 return=OK
[2024/05/10 22:35:53] [debug] [out flush] cb_destroy coro_id=0
[2024/05/10 22:35:53] [trace] [coro] destroy coroutine=0x7dae40019150 data=0x7dae40019170
[2024/05/10 22:35:53] [debug] [task] destroy task=0x7dae40019030 (task_id=0)
^C[2024/05/10 22:35:56] [engine] caught signal (SIGINT)
[2024/05/10 22:35:56] [trace] [engine] flush enqueued data
[2024/05/10 22:35:56] [ warn] [engine] service will shutdown in max 5 seconds
[2024/05/10 22:35:56] [ info] [input] pausing dummy.0
[2024/05/10 22:35:56] [ info] [engine] service has stopped (0 pending tasks)
[2024/05/10 22:35:56] [ info] [input] pausing dummy.0
[2024/05/10 22:35:56] [trace] [upstream] destroy connection #31 to 127.0.0.1:3100
  • Attached Valgrind output that shows no leaks or memory corruption was found
==140694== Memcheck, a memory error detector
==140694== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==140694== Using Valgrind-3.23.0 and LibVEX; rerun with -h for copyright info
==140694== Command: ./bin/fluent-bit -c fluent-bit.conf
==140694== 
...
==140694== 
==140694== HEAP SUMMARY:
==140694==     in use at exit: 0 bytes in 0 blocks
==140694==   total heap usage: 1,769 allocs, 1,769 frees, 806,933 bytes allocated
==140694== 
==140694== All heap blocks were freed -- no leaks are possible
==140694== 
==140694== For lists of detected and suppressed errors, rerun with: -s
==140694== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
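The HTTP body captured in the debug output above can be checked with a few lines of Python. This is only a sanity check on the payload quoted in this PR. The second payload, `quoted_body`, is an assumed reconstruction of what a line that kept its JSON quotes would look like, built here for comparison and not captured from a real run.

```python
import json

# Body received by the fake HTTP server, copied from the test output above.
body = (
    '{"streams":[{"stream":{"job":"fluent-bit"},'
    '"values":[["1715344552991602966","value"]]}]}'
)

payload = json.loads(body)
timestamp_ns, line = payload["streams"][0]["values"][0]
print(line)                 # the log line itself, with no surrounding quotes
print(len(body.encode()))   # compare with the second net_write in the trace log

# Assumed shape for comparison: if the line still carried JSON quotes,
# they would show up escaped inside the push payload.
quoted_payload = {
    "streams": [
        {"stream": {"job": "fluent-bit"},
         "values": [[timestamp_ns, json.dumps("value")]]}
    ]
}
quoted_body = json.dumps(quoted_payload, separators=(",", ":"))
print(quoted_body)
```

The body is 88 bytes long, which lines up with the 88-byte write reported in the trace log after the request headers.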

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

fluent/fluent-bit-docs#1368

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Signed-off-by: Andrew Titmuss <iandrewt@icloud.com>
@iandrewt

I'm undecided on whether raw or plain should be the keyword for this behaviour. I'm happy for maintainers to change it if there are strong opinions one way or the other. I'll wait until there's consensus on that before doing the documentation PR.

@patrick-stephens

Can you link the docs PR once you have it too?

iandrewt added a commit to iandrewt/fluent-bit-docs that referenced this pull request May 10, 2024
Relates to fluent/fluent-bit#8814

Signed-off-by: Andrew Titmuss <iandrewt@icloud.com>