Data ingestion throws 400 - Rejected by OpenSearch #82

Open
2 tasks done
leowinterde opened this issue Oct 25, 2022 · 5 comments

Comments

@leowinterde

leowinterde commented Oct 25, 2022

  • read the contribution guideline
  • (optional) already reported 3rd party upstream repository or mailing list if you use k8s addon or helm charts.

Steps to replicate

Our log pipeline:
FluentBit --> FluentD --> OpenSearch

FluentBit Config:

[SERVICE]
    flush               5
    daemon              Off
    log_level           info
    parsers_file        parsers.conf
    plugins_file        plugins.conf
    http_server         Off
[INPUT]
    Name                winevtlog
    Channels            Setup,Windows PowerShell,System,Security,Application,Microsoft-Windows-Sysmon/Operational,Microsoft-Windows-TerminalServices-LocalSessionManager/Operational,Microsoft-Windows-WMI-Activity/Operational
    Interval_Sec        1
    DB                  winlog.sqlite
    String_Inserts      False
    Render_Event_As_XML True
    Use_ANSI            True
    Tag                 log
[FILTER]
    Name                record_modifier
    Match               *
    Record hostname     ${HOSTNAME}
    Record log-os       windows
    Record log-app      test
[OUTPUT]
    Name                forward
    Match               log
    Upstream            forward-log

FluentD Config:

# Log FluentBit to FluentD input
<source>
  @type forward
  @id input-forward-log
  port 24224
  bind 0.0.0.0
  require_ack_response true
  tag log
  source_address_key client_ip
  source_hostname_key client_hostname
  <transport tls>
    cert_path /PATH/X.cer
    private_key_path /PATH/X.key
  </transport>
</source>

## Save Logs
<match log.**>
    @type opensearch
    @id output-os
    hosts HOST1:9200,HOST2:9200,HOST3:9200
    ca_file /PATH/os-ca.pem
    client_cert /PATH/os-client.pem
    client_key /PATH/os-client-key.pem
    #client_key_pass password
    scheme https
    user admin
    password admin
    logstash_format true
    log_os_400_reason true
    reconnect_on_error true
    reload_on_failure true
    reload_connections false
    <buffer>
      @type file
      path  /PATH/opensearch
    </buffer>
</match>
...

Error inside the Logs:

2022-10-25 15:01:05 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::OpenSearchErrorHandler::OpenSearchError error="400 - Rejected by OpenSearch [error type]: mapper_parsing_exception [reason]: 'failed to parse field [System] of type [text] in document with id 'ThapD4QBCUjyQn6TS8oK'. Preview of field's value: '''" location=nil tag="log" time=2022-10-25 15:00:55.585218200 +0000 record={"System"=>"<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'><System><Provider Name='Microsoft-Windows-Sysmon' Guid='{5770385F-C22A-43E0-BF4C-06F5698FFBD9}'/><EventID>7</EventID><Version>3</Version><Level>4</Level><Task>7</Task><Opcode>0</Opcode><Keywords>0x8000000000000000</Keywords><TimeCreated SystemTime='2022-10-25T15:00:54.070439500Z'/><EventRecordID>17401</EventRecordID><Correlation/><Execution ProcessID='2096' ThreadID='2996'/><Channel>Microsoft-Windows-Sysmon/Operational</Channel><Computer>Win2016FBTest</Computer><Security UserID='S-1-5-18'/></System><EventData><Data Name='RuleName'>ImageLoad</Data><Data Name='UtcTime'>2022-10-25 15:00:54.064</Data><Data Name='ProcessGuid'>{45FC5480-FA26-6357-7361-000000000800}</Data><Data Name='ProcessId'>1712</Data><Data Name='Image'>C:\\Windows\\System32\\Configure-SMRemoting.exe</Data><Data Name='ImageLoaded'>C:\\Windows\\System32\\advapi32.dll</Data><Data Name='FileVersion'>10.0.14393.2969 (rs1_release.190503-1820)</Data><Data Name='Description'>Advanced Windows 32 Base API</Data><Data Name='Product'>Microsoft\xAE Windows\xAE Operating System</Data><Data Name='Company'>Microsoft Corporation</Data><Data Name='OriginalFileName'>advapi32.dll</Data><Data Name='Hashes'>SHA1=D616C4AEFF4DB7B2DD92332D118EDED28D298302,MD5=F5442C4B9A99C3AED71BED79AC46DAD1,SHA256=05F47403F3BD93FB11F39A5CB4D6E4DD08B35FF4FA3D4969D8E5396D38FB484B,IMPHASH=D2F471BB25AF6310EB67BD4EA99B4DBC</Data><Data Name='Signed'>true</Data><Data Name='Signature'>Microsoft Windows</Data><Data Name='SignatureStatus'>Valid</Data><Data 
Name='User'>WIN2016FBTEST\\Administrator</Data></EventData></Event>", "Message"=>"Image loaded:\r\nRuleName: ImageLoad\r\nUtcTime: 2022-10-25 15:00:54.064\r\nProcessGuid: {45FC5480-FA26-6357-7361-000000000800}\r\nProcessId: 1712\r\nImage: C:\\Windows\\System32\\Configure-SMRemoting.exe\r\nImageLoaded: C:\\Windows\\System32\\advapi32.dll\r\nFileVersion: 10.0.14393.2969 (rs1_release.190503-1820)\r\nDescription: Advanced Windows 32 Base API\r\nProduct: Microsoft\xAE Windows\xAE Operating System\r\nCompany: Microsoft Corporation\r\nOriginalFileName: advapi32.dll\r\nHashes: SHA1=D616C4AEFF4DB7B2DD92332D118EDED28D298302,MD5=F5442C4B9A99C3AED71BED79AC46DAD1,SHA256=05F47403F3BD93FB11F39A5CB4D6E4DD08B35FF4FA3D4969D8E5396D38FB484B,IMPHASH=D2F471BB25AF6310EB67BD4EA99B4DBC\r\nSigned: true\r\nSignature: Microsoft Windows\r\nSignatureStatus: Valid\r\nUser: WIN2016FBTEST\\Administrator", "hostname"=>"Win2016FBTest", "log-os"=>"windows", "log-app"=>"test", "client_ip"=>"10.1.1.165", "client_hostname"=>"10.1.1.165"}

Expected Behavior or What you need to ask

We only see this error with logs from Windows Server 2016 and 2012 R2. What should the log format look like in order to ingest the data successfully into OpenSearch?
...

Using Fluentd and OpenSearch plugin versions

  • OS version
    • Ubuntu LTS and Win 2016 Server
  • Fluentd Version
    • td-agent 4.4.1 fluentd 1.15.2 (c32842297ed2c306f1b841a8f6e55bdd0f1cb27f)
  • OpenSearch plugin version
    • 2022-10-25 15:01:01 +0000 [info]: gem 'fluent-plugin-opensearch' version '1.0.7'
  • OpenSearch version
    • 2.2.0
@cosmo0920
Collaborator

Could you use Render_Event_As_XML False instead of Render_Event_As_XML True, so that the event is mapped to a hash and the JSON parsing error is avoided? Is there a reason you need Render_Event_As_XML True?

@leowinterde
Author

Sure, we could use Render_Event_As_XML False, but harvesting the additional information formatted as XML is a winevtlog feature that we need further down our pipeline. Isn't there any option to ingest this data into OpenSearch without errors?

@kaiohenricunha

kaiohenricunha commented Apr 24, 2023

Any workaround? I've been trying to fix this for a while.

@ArthurSudbrackIbarra

Also experiencing the same issue.

@kaiohenricunha

kaiohenricunha commented Apr 24, 2023

Issue resolved by adding a custom dedot Fluentd ClusterFilter to the Fluentd configuration:

apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFilter
metadata:
  labels:
    filter.fluentd.fluent.io/enabled: "true"
    filter.fluentd.fluent.io/name: "de-dot"
  name: de-dot
spec:
  filters:
    - customPlugin:
        config: |
          <filter **>
            @type dedot
            de_dot_separator _
            de_dot_nested ${FLUENTD_DEDOT_NESTED:=true}
          </filter>
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFluentdConfig
metadata:
  labels:
    config.fluentd.fluent.io/enabled: "true"
  name: cluster-fluentd-config
spec:
  clusterFilterSelector:
    matchLabels:
      filter.fluentd.fluent.io/enabled: "true"
      filter.fluentd.fluent.io/name: "de-dot"
  clusterOutputSelector:
    matchLabels:
      output.fluentd.fluent.io/enabled: "true"
      output.fluentd.fluent.io/tenant: "core"
  watchedNamespaces: [] # watches all namespaces when empty
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: Fluentd
metadata:
  name: fluentd
  namespace: fluent-system
  labels:
    app.kubernetes.io/name: fluentd
spec:
  globalInputs:
    - forward:
        bind: 0.0.0.0
        port: 24224
  replicas: 1
  image: kubesphere/fluentd:${FLUENTD_IMAGE_TAG:=v1.15.3}
  imagePullSecrets:
    - name: image-pull-secret
  resources:
    limits:
      cpu: ${FLUENTD_CPU_LIMIT:=500m}
      memory: ${FLUENTD_MEMORY_LIMIT:=500Mi}
    requests:
      cpu: ${FLUENTD_CPU_REQUEST:=100m}
      memory: ${FLUENTD_MEMORY_REQUEST:=128Mi}
  fluentdCfgSelector:
    matchLabels:
      config.fluentd.fluent.io/enabled: "true"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-output-opensearch
  labels:
    output.fluentd.fluent.io/enabled: "true"
    output.fluentd.fluent.io/tenant: "raas-core"
spec:
  outputs:
    - customPlugin:
        config: |
          <match **>
            @type opensearch
            host "${FLUENT_OPENSEARCH_HOST}"
            port 443
            logstash_format  true
            logstash_prefix logs-XXX-core
            scheme https
            log_os_400_reason true
            @log_level ${FLUENTD_OUTPUT_LOGLEVEL:=info}
            <endpoint>
              url "https://${FLUENT_OPENSEARCH_HOST}"
              region "${FLUENT_OPENSEARCH_REGION}"
              assume_role_arn "#{ENV['AWS_ROLE_ARN']}"
              assume_role_web_identity_token_file "#{ENV['AWS_WEB_IDENTITY_TOKEN_FILE']}"
            </endpoint>
          </match>

This works because dots in field names are ambiguous: OpenSearch reads each dot as a path separator and expands the name into nested objects. For example, logs with the label kubernetes.labels.statefulset.kubernetes.io/pod-name will collide with the label kubernetes.labels.statefulset.kubernetes.io/pod-name.keyword, because the first needs pod-name to hold a value and the second needs it to hold an object.

In this case, OpenSearch expects the JSON of this log to have the following format, and then tries to build the same path a second time for the .keyword label:

{
  "kubernetes": {
    "labels": {
      "statefulset": {
        "kubernetes": {
          "io/pod-name": "some_value"
        }
      }
    }
  }
}

This misinterpretation can cause unexpected behavior during indexing and querying, and might result in data loss or errors. By replacing dots with underscores using the dedot filter, we can avoid such ambiguity and ensure that the field name is correctly interpreted.
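To make the collision concrete, here is a minimal Python sketch of the idea. It is not OpenSearch code: `expand_dotted` and `merge` are hypothetical helpers that only imitate how a dotted field name is read as a path of nested objects, and why a field cannot be a plain value and an object at the same time.

```python
def expand_dotted(field, value):
    """Expand 'a.b.c' into {'a': {'b': {'c': value}}}, reading each dot as a path separator."""
    node = value
    for part in reversed(field.split(".")):
        node = {part: node}
    return node


def merge(target, source, path=""):
    """Merge source into target in place.

    Raises ValueError when one side holds a plain value and the other holds
    an object at the same path (the situation described above).
    """
    for key, value in source.items():
        here = f"{path}.{key}" if path else key
        if key not in target:
            target[key] = value
        elif isinstance(target[key], dict) and isinstance(value, dict):
            merge(target[key], value, here)
        elif isinstance(target[key], dict) or isinstance(value, dict):
            raise ValueError(f"field [{here}] is both a value and an object")
        else:
            target[key] = value  # same plain field seen twice: last value wins
    return target


if __name__ == "__main__":
    label = "kubernetes.labels.statefulset.kubernetes.io/pod-name"
    doc = expand_dotted(label, "some_value")
    try:
        merge(doc, expand_dotted(label + ".keyword", "some_other_value"))
    except ValueError as err:
        print(err)
```

Merging the two labels from the example fails at the pod-name level, which is the kind of conflict the dedot filter is meant to avoid.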

After applying the dedot filter, it becomes:

{
  "kubernetes": {
    "labels": {
      "statefulset_kubernetes_io/pod-name": "some_pod_name",
      "statefulset_kubernetes_io/pod-name_keyword": "some_pod_name_keyword"
    }
  }
}
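The transformation can be sketched in a few lines of Python. This is an illustration only and not the plugin's implementation: the hypothetical `dedot` function below assumes string keys, and its `separator` and `nested` arguments are stand-ins for what the de_dot_separator and de_dot_nested settings describe.

```python
def dedot(record, separator="_", nested=True):
    """Return a copy of record with every dot in a key replaced by separator.

    With nested=True the replacement is also applied to dicts found inside
    the record (including dicts inside lists); otherwise only top-level keys change.
    """
    result = {}
    for key, value in record.items():
        if nested and isinstance(value, dict):
            value = dedot(value, separator, nested)
        elif nested and isinstance(value, list):
            value = [
                dedot(item, separator, nested) if isinstance(item, dict) else item
                for item in value
            ]
        result[key.replace(".", separator)] = value
    return result
```

Calling `dedot` on a record whose labels map contains the key statefulset.kubernetes.io/pod-name yields the key statefulset_kubernetes_io/pod-name, so no dot is left for OpenSearch to interpret as a path.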

I'm using the fluent-operator, so the above configuration will be rendered like this:

<ROOT>
  <system>
    rpc_endpoint "127.0.0.1:24444"
    log_level info
    workers 1
  </system>
  <source>
    @type forward
    bind "0.0.0.0"
    port 24224
  </source>
  <match **>
    @id main
    @type label_router
    <route>
      @label "@b129def99623e1778c83fa647cbb2c60"
      <match>
      </match>
    </route>
  </match>
  <label @b129def99623e1778c83fa647cbb2c60>
    <filter **>
      @type dedot
      de_dot_separator "_"
      de_dot_nested true
    </filter>
    <match **>
      @type opensearch
      host "XXXX.XXXX-XXX-XXX.es.amazonaws.com"
      port 443
      logstash_format true
      logstash_prefix "logs-XXX-core"
      scheme https
      log_os_400_reason true
      @log_level "info"
      <endpoint>
        url https://XXXX.us-XXX-XXXX.es.amazonaws.com
        region "us-west-2"
        assume_role_arn "arn:aws:iam::XXXX:role/XXXX/fluentd-os-access-us-west-2"
        assume_role_web_identity_token_file "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"
      </endpoint>
    </match>
  </label>
  <match **>
    @type null
    @id main-no-output
  </match>
  <label @FLUENT_LOG>
    <match fluent.*>
      @type null
      @id main-fluentd-log
    </match>
  </label>
</ROOT>
