Parser plugin multiline doesn't work as it appears in docs #338

Open
satrushn opened this issue Jul 1, 2021 · 4 comments

Comments

satrushn commented Jul 1, 2021

The multiline parser plugin doesn't work as described in the docs, or I am misunderstanding something.
I tried to use the multiline parser in a <filter> section, following the docs: https://docs.fluentd.org/parser/multiline
I need to collect several lines into one message.

Example of log:

[DockerLogGenerator] Multiline: 2021-07-01 12:29:42.862326529 +0000 UTC m=+107095.440406775
This is the second line
This is the third line

I expect that this log will be parsed as something like this:

record:
{
"message":"[DockerLogGenerator] Multiline: 2021-07-01 12:29:42.862326529 +0000 UTC m=+107095.440406775\n This is the second line\n This is the third line"
}

But that is not what I get.

Example of my config:

[image: config]

Could you please clarify in the docs how to collect multiline logs correctly in this case?
Thanks.

Contributor

kenhys commented Jul 2, 2021

I guess that using <parse> inside <source> may be what you want.

Here is a sample.

<source>
  @type tail
  path logs.txt
  tag test
  read_from_head true
  <parse>
    @type multiline
    format_firstline /\[/
    format1 /^(?<message>.*)/
  </parse>
</source>

<match test>
  @type stdout
</match>

OUTPUT:

2021-07-02 11:27:24 +0900 [info]: #0 following tail of logs.txt
2021-07-02 11:27:24.876553899 +0900 test: {"message":"[DockerLogGenerator] Multiline: 2021-07-01 12:29:42.862326529 +0000 UTC m=+107095.440406775\r\nThis is the second line\r\nThis is the third line"}
2021-07-02 11:27:24 +0900 [info]: #0 fluentd worker is now running worker=0

Alternatively, there seems to be a straightforward way: copy log to message with record_transformer (using the regex above).
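
To picture what the sample above does, here is a rough Ruby model of first-line grouping. It is a sketch for illustration only, not Fluentd's actual code: `group_multiline`, `SAMPLE_FIRSTLINE`, and `SAMPLE_FORMAT` are hypothetical names. The `/m` flag on the format is an assumption chosen to reproduce the output above, where `.*` captures across line breaks.

```ruby
# Simplified model of first-line grouping (illustration only, not Fluentd code).
SAMPLE_FIRSTLINE = /\[/                  # format_firstline from the sample
SAMPLE_FORMAT    = /^(?<message>.*)/m    # format1; /m lets "." span line breaks (assumption)

# Returns [records, unmatched]: parsed records, plus lines seen before any first line.
def group_multiline(lines, firstline: SAMPLE_FIRSTLINE, format: SAMPLE_FORMAT)
  records = []
  unmatched = []
  buffer = nil

  # Parse the buffered block, if any, and reset the buffer.
  flush = lambda do
    next if buffer.nil?
    m = format.match(buffer.chomp)
    records << m.named_captures if m
    buffer = nil
  end

  lines.each do |line|
    if firstline.match?(line)
      flush.call            # a new first line closes the previous block
      buffer = line.dup
    elsif buffer.nil?
      unmatched << line     # nothing to attach this line to yet
    else
      buffer << line        # continuation of the current block
    end
  end
  flush.call                # end of input: emit what is left in this model
  [records, unmatched]
end

SAMPLE_LOG = [
  "[DockerLogGenerator] Multiline: 2021-07-01 12:29:42.862326529 +0000 UTC m=+107095.440406775\n",
  "This is the second line\n",
  "This is the third line\n",
]

records, unmatched = group_multiline(SAMPLE_LOG)
puts records.first["message"]
```

In this model the three lines end up in a single record because only the first one matches `format_firstline`; a line that arrives before any first line has nothing to be attached to.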

Author

satrushn commented Jul 2, 2021

Thank you for your answer.
I am using Fluentd through Fluentd-operator, so there is no way to change anything in <source>; that is why I want to use the parser in a filter.
Is there another way to solve the problem?

Contributor

kenhys commented Jul 29, 2021

> I am using Fluentd through Fluentd-operator, so there is no way to change anything in <source>; that is why I want to use the parser in a filter.
> Is there another way to solve the problem?

I do not fully understand your setup, but https://github.com/fluent-plugins-nursery/fluent-plugin-concat may help you.

joek-office commented Jul 10, 2023

Hello everyone,
I came across this old issue, and it looks like I have the same or a similar problem. I hope someone can clarify the documentation or suggest a way to solve it.

What I want to do:
I have many different files with different log formats. For these I have written a new regex to use with multiline. The key point is that the files contain both single-line and multiline log entries. I want to handle this as in the documentation example (Java Stacktrace Log). The problem is that, for multiline log output, the following warning appears in my environment:
#0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log
followed by the lines of the log output, one by one.

'2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "[2023-07-10 14:04:31 +0200].535 35 WARNING [-] Invalid HTTP request received.\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "Traceback (most recent call last):\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "  File \"/var/lib/kolla/venv/lib64/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py\", line 129, in handle_events\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "    event = self.conn.next_event()\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "  File \"/var/lib/kolla/venv/lib64/python3.9/site-packages/h11/_connection.py\", line 443, in next_event\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "    exc._reraise_as_remote_protocol_error()\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "  File \"/var/lib/kolla/venv/lib64/python3.9/site-packages/h11/_util.py\", line 76, in _reraise_as_remote_protocol_error\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "    raise self\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "  File \"/var/lib/kolla/venv/lib64/python3.9/site-packages/h11/_connection.py\", line 425, in next_event\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "    event = self._extract_next_receive_event()\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "  File \"/var/lib/kolla/venv/lib64/python3.9/site-packages/h11/_connection.py\", line 367, in _extract_next_receive_event\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "    event = self._reader(self._receive_buffer)\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "  File \"/var/lib/kolla/venv/lib64/python3.9/site-packages/h11/_readers.py\", line 68, in maybe_read_from_IDLE_client\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "    raise LocalProtocolError(\"illegal request line\")\n"
2023-07-10 14:04:31 +0200 [warn]: #0 got incomplete line before first line from /var/log/kolla/skyline/skyline-error.log: "h11._util.RemoteProtocolError: illegal request line\n"'

My configuration is as follows:

<source>
  @type tail
  path /var/log/kolla/*/*-access.log,/var/log/kolla/*/*-error.log,/var/log/kolla/*/*_access.log,/var/log/kolla/*/*_error.log
  pos_file /var/run/td-agent/kolla-openstack-wsgi.pos
  tag kolla.*
  enable_watch_timer false
  <parse>
    @type multiline
    format_firstline /^([^\/\r\nA-Z]*?\[|\[?)(?<Timestamp>((\+)?((?<=[A-Za-z]{4})\d{2,6}|(?<![A-Za-z]{4})(\d{2,6}|\w{3}))(\/|-|:|\s|\.)?){3,7}(?=\d{4}\])\d{4})\]?/
    format1 /^([^\/\r\nA-Z]*?\[|\[?)(?<Timestamp>((\+)?((?<=[A-Za-z]{4})\d{2,6}|(?<![A-Za-z]{4})(\d{2,6}|\w{3}))(\/|-|:|\s|\.)?){3,7}(?=\d{4}\])\d{4})\]?(?<Payload>.*)/
  </parse>
</source>

I match the following log formats with this regex:

10.1.101.12 - - [28/Jun/2023:08:11:12 +0200]
[Tue Jun 27 12:07:40.882366 2023]
[2023-06-28 08:36:26 +0200].455
2023/06/28 08:36:20
2023-06-27 09:49:10.619993

The regex looks for pieces of the strings, not for the whole strings.
piece: [prefix][infix][suffix], repeated up to seven times
prefix: an optional + sign
infix: 2-6 digits or 3 word characters
suffix: one of /, -, :, . or a space character

After the up to seven pieces, an additional four digits may follow if no characters are present.

Can anyone help or explain why Fluentd can't parse this log output as multiline and emits this warning?
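
One way to narrow this down is to test `format_firstline` on its own against the first line of each expected block, since the warning is logged for lines that arrive before any line has matched it. The Ruby sketch below does that with the regex exactly as it appears in the post. Because the config was pasted outside a code block, formatting may have altered the regex, so treat the result as a check of the posted text only. `firstline_report`, `POSTED_FIRSTLINE`, and `SAMPLES` are hypothetical names.

```ruby
# format_firstline copied verbatim from the post (formatting may have altered it).
POSTED_FIRSTLINE = /^([^\/\r\nA-Z]*?\[|\[?)(?<Timestamp>((\+)?((?<=[A-Za-z]{4})\d{2,6}|(?<![A-Za-z]{4})(\d{2,6}|\w{3}))(\/|-|:|\s|\.)?){3,7}(?=\d{4}\])\d{4})\]?/

# The five formats listed in the post, plus the first line from the warning output.
SAMPLES = [
  "10.1.101.12 - - [28/Jun/2023:08:11:12 +0200]",
  "[Tue Jun 27 12:07:40.882366 2023]",
  "[2023-06-28 08:36:26 +0200].455",
  "2023/06/28 08:36:20",
  "2023-06-27 09:49:10.619993",
  "[2023-07-10 14:04:31 +0200].535 35 WARNING [-] Invalid HTTP request received.",
]

# Maps each line to true/false depending on whether the regex matches it.
def firstline_report(regex, lines)
  lines.to_h { |line| [line, regex.match?(line)] }
end

firstline_report(POSTED_FIRSTLINE, SAMPLES).each do |line, matched|
  puts format("%-5s %s", matched, line)
end
```

With the regex as posted, only the `[Tue Jun 27 12:07:40.882366 2023]` sample matches. The trailing `(?=\d{4}\])\d{4}` is not optional, so a match requires four digits directly before a `]`, immediately after one of the pieces. In `+0200]` the four digits follow a `+`, which no piece can end on, and the last two formats contain no `]` at all. One reading of the description is that the trailing four digits were meant to be optional; as written they are required, which would explain why even the first line of the traceback block is reported as incomplete.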
