bignum too big to convert into `long long' #4258

Open
bjg2 opened this issue Jul 28, 2023 · 2 comments

bjg2 commented Jul 28, 2023

Describe the bug

Flushing the buffer fails with RangeError - "bignum too big to convert into `long long'".

To Reproduce

There is no exact repro, but my configuration is below. I'm not sure which bignum is too big; I guess some data occasionally comes through with a value bigger than a long long. That seems to pass through the JSON parser fine, but I'm not sure why it then fails in the mongo buffer flush.
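
For what it's worth, here is a minimal sketch of what I suspect is happening (I'm assuming the mongo plugin serializes records with the bson gem; the exact message may differ between bson versions):

require "json"
require "bson"   # the serializer I assume fluent-plugin-mongo uses

# A huge value survives JSON parsing as a plain Ruby Integer...
record = JSON.parse('{"value": 18446744073709551616}')   # 2**64

# ...but BSON has no integer type wider than int64, so serializing fails.
begin
  record.to_bson
rescue RangeError => e
  puts e.message   # e.g. "bignum too big to convert into `long long'"
end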

Expected behavior

Almost anything else would be better behavior for me: set -1 instead of the real value, clamp to the largest long long value, substitute some placeholder, or remove that record completely. Anything except the whole chunk failing constantly, since that seems to lead to fluentd getting stuck after some time.
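
To illustrate the kind of workaround I'd settle for, here is an untested sketch that clamps a single field before the mongo output. The field name counter is purely hypothetical; I don't actually know which field carries the oversized value:

# Hypothetical clamp: cap "counter" (made-up field name) at the int64 maximum
<filter kubernetes.**>
  @type record_transformer
  enable_ruby
  <record>
    counter ${v = record["counter"]; v.is_a?(Integer) && v >= 2**63 ? 2**63 - 1 : v}
  </record>
</filter>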

Your Environment

- Fluentd version: 1.15-1
- TD Agent version: /
- Operating system: alpine linux, v3.16.0
- Kernel version: 5.10.0-0.bpo.15-amd64

Your Configuration

# Inputs from container logs
<source>
  @type tail
  @id in_tail_container_logs
  path /var/log/containers/*.log
  exclude_path ["/var/log/containers/cilium*"]
  pos_file /var/log/fluentd.log.pos
  read_from_head
  tag kubernetes.*
  <parse>
    @type cri
  </parse>
</source>

# Merge logs split into multiple lines
<filter kubernetes.**>
  @type concat
  key message
  use_partial_cri_logtag true
  partial_cri_logtag_key logtag
  partial_cri_stream_key stream
  separator ""
</filter>

# Enriches records with Kubernetes metadata
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

# Prettify kubernetes metadata
<filter kubernetes.**>
  @type record_transformer
  enable_ruby
  <record>
    nodeName ${record.dig("kubernetes", "host")}
    namespaceName ${record.dig("kubernetes", "namespace_name")}
    podName ${record.dig("kubernetes", "pod_name")}
    containerName ${record.dig("kubernetes", "container_name")}
    containerImage ${record.dig("kubernetes", "container_image")}
  </record>
  remove_keys docker,kubernetes
</filter>
 
# Expands inner json
<filter kubernetes.**>
  @type parser
  format json
  key_name message
  reserve_data true
  remove_key_name_field true
  emit_invalid_record_to_error false
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  time_key time
  keep_time_key
</filter>

# Mongodb keys should not have dollar or a dot inside
<filter kubernetes.**>
  @type rename_key
  replace_rule1 \$ [dollar]
</filter>

# Mongodb keys should not have dollar or a dot inside
<filter kubernetes.**>
  @type rename_key
  replace_rule1 \. [dot]
</filter>

# Outputs to log db
<match kubernetes.**>
  @type mongo

  connection_string "#{ENV['MONGO_ANALYTICS_DB_HOST']}"
  collection logs

  <buffer>
    @type file
    path /var/log/file-buffer
    flush_thread_count 8
    flush_interval 3s
    chunk_limit_size 32M
    flush_mode interval
    retry_max_interval 60
    retry_forever true
  </buffer>
</match>

Your Error Log

2023-07-28 07:44:22 +0000 [warn]: #0 retry succeeded. chunk_id="6018738263cb959ce87d310203d5692c"
2023-07-28 07:44:23 +0000 [warn]: #0 failed to flush the buffer. retry_times=0 next_retry_time=2023-07-28 07:44:24 +0000 chunk="5f44a1f448d6b3feab006a95a5405527" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:44:23 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:44:24 +0000 [warn]: #0 failed to flush the buffer. retry_times=1 next_retry_time=2023-07-28 07:44:27 +0000 chunk="5f44a1f448d6b3feab006a95a5405527" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:44:24 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:44:26 +0000 [warn]: #0 failed to flush the buffer. retry_times=2 next_retry_time=2023-07-28 07:44:31 +0000 chunk="5f44a1f448d6b3feab006a95a5405527" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:44:26 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:44:26 +0000 [warn]: #0 failed to flush the buffer. retry_times=2 next_retry_time=2023-07-28 07:44:31 +0000 chunk="6017a8f8ca2765e2e8eeb789257a6e08" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:44:26 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:44:30 +0000 [warn]: #0 failed to flush the buffer. retry_times=3 next_retry_time=2023-07-28 07:44:39 +0000 chunk="6017a8f8ca2765e2e8eeb789257a6e08" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:44:30 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:44:38 +0000 [warn]: #0 failed to flush the buffer. retry_times=4 next_retry_time=2023-07-28 07:44:54 +0000 chunk="6017a8f8ca2765e2e8eeb789257a6e08" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:44:38 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:44:38 +0000 [warn]: #0 failed to flush the buffer. retry_times=4 next_retry_time=2023-07-28 07:44:56 +0000 chunk="5f44a1f448d6b3feab006a95a5405527" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:44:38 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:44:41 +0000 [info]: #0 stats - namespace_cache_size: 4, pod_cache_size: 18, namespace_cache_api_updates: 5, pod_cache_api_updates: 5, id_cache_miss: 5, namespace_cache_host_updates: 4, pod_cache_host_updates: 18
2023-07-28 07:44:56 +0000 [warn]: #0 failed to flush the buffer. retry_times=5 next_retry_time=2023-07-28 07:45:29 +0000 chunk="5f44a1f448d6b3feab006a95a5405527" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:44:56 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:44:56 +0000 [warn]: #0 failed to flush the buffer. retry_times=5 next_retry_time=2023-07-28 07:45:30 +0000 chunk="6017a8f8ca2765e2e8eeb789257a6e08" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:44:56 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:45:11 +0000 [info]: #0 stats - namespace_cache_size: 4, pod_cache_size: 18, namespace_cache_api_updates: 5, pod_cache_api_updates: 5, id_cache_miss: 5, namespace_cache_host_updates: 4, pod_cache_host_updates: 18
2023-07-28 07:45:30 +0000 [warn]: #0 failed to flush the buffer. retry_times=6 next_retry_time=2023-07-28 07:46:32 +0000 chunk="5f0bc3088db314e9e88e1a6920da4a11" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:45:30 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:45:30 +0000 [warn]: #0 failed to flush the buffer. retry_times=6 next_retry_time=2023-07-28 07:46:33 +0000 chunk="6017a8f8ca2765e2e8eeb789257a6e08" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:45:30 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:45:30 +0000 [warn]: #0 failed to flush the buffer. retry_times=6 next_retry_time=2023-07-28 07:46:25 +0000 chunk="5f44a1f448d6b3feab006a95a5405527" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:45:30 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:45:41 +0000 [info]: #0 stats - namespace_cache_size: 4, pod_cache_size: 18, namespace_cache_api_updates: 5, pod_cache_api_updates: 5, id_cache_miss: 5, namespace_cache_host_updates: 4, pod_cache_host_updates: 18
2023-07-28 07:46:11 +0000 [info]: #0 stats - namespace_cache_size: 4, pod_cache_size: 18, namespace_cache_api_updates: 5, pod_cache_api_updates: 5, id_cache_miss: 5, namespace_cache_host_updates: 4, pod_cache_host_updates: 18
2023-07-28 07:46:25 +0000 [warn]: #0 failed to flush the buffer. retry_times=7 next_retry_time=2023-07-28 07:47:27 +0000 chunk="6017a8f8ca2765e2e8eeb789257a6e08" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:46:25 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:46:25 +0000 [warn]: #0 failed to flush the buffer. retry_times=7 next_retry_time=2023-07-28 07:47:25 +0000 chunk="5f44a1f448d6b3feab006a95a5405527" error_class=RangeError error="bignum too big to convert into `long long'"
  2023-07-28 07:46:25 +0000 [warn]: #0 suppressed same stacktrace
2023-07-28 07:46:41 +0000 [info]: #0 stats - namespace_cache_size: 4, pod_cache_size: 18, namespace_cache_api_updates: 5, pod_cache_api_updates: 5, id_cache_miss: 5, namespace_cache_host_updates: 4, pod_cache_host_updates: 18

Additional context

It is running as a DaemonSet in DigitalOcean Kubernetes.


bjg2 commented Jul 28, 2023

Even more, I cannot make sense of all the buffer settings. If I remove retry_forever and set retry_timeout to 24h (86400), the problem chunks still never get deleted, as these settings don't seem to apply at the chunk level. I have chunks queued for months that keep failing, some even from last year, and they cannot be deleted: whenever some chunk flushes successfully, retry_times, next_retry_time, everything is reset to the initial state for all chunks, even the problem ones?
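
For reference, this is the direction I'm experimenting with, based on my reading of the buffer docs (untested sketch; only the retry settings and the secondary section differ from my config above):

<match kubernetes.**>
  @type mongo
  # connection_string / collection as in my config above

  <buffer>
    @type file
    path /var/log/file-buffer
    flush_thread_count 8
    flush_interval 3s
    chunk_limit_size 32M
    flush_mode interval
    retry_max_interval 60
    # instead of retry_forever true: give up after 24h...
    retry_timeout 24h
  </buffer>

  # ...and divert chunks that exhaust retries to disk instead of
  # letting them clog the queue
  <secondary>
    @type secondary_file
    directory /var/log/fluentd/failed-chunks
  </secondary>
</match>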


bjg2 commented Oct 12, 2023

Just happened again: fluentd got completely stuck, this time in minikube. As before, it was reporting 'bignum too big' for two chunks, and once I removed those chunks and restarted fluentd it became unstuck (a simple restart didn't do it; I had to remove the chunks before restarting). The chunks in question are attached. Please advise, as I don't know what to do to mitigate this issue.
logs.zip
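
In case it helps, this is the kind of script I'm thinking of using to look for offending values inside the stuck chunks (untested sketch with a big assumption: that the file-buffer chunk body is a raw msgpack stream of [time, record] pairs, which I have not verified):

require "msgpack"

INT64_RANGE = (-2**63...2**63)   # what BSON can store as int64

# Scan a fluentd file-buffer chunk and report integers outside int64.
def scan_chunk(path)
  File.open(path, "rb") do |io|
    MessagePack::Unpacker.new(io).each do |_time, record|
      next unless record.is_a?(Hash)
      record.each do |key, value|
        if value.is_a?(Integer) && !INT64_RANGE.cover?(value)
          puts "#{path}: #{key} = #{value} (out of int64 range)"
        end
      end
    end
  end
end

# Adjust the glob to wherever your buffer path points.
Dir.glob("/var/log/file-buffer/buffer.*.log").each { |f| scan_chunk(f) }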
