SIGBUS error with 1.2.2 #8099

Closed
Zelldon opened this issue Oct 29, 2021 · 19 comments · Fixed by #9731
Labels
kind/bug (Categorizes an issue or PR as a bug) · scope/broker (Marks an issue or PR to appear in the broker section of the changelog) · support (Marks an issue as related to a customer support request) · version:1.3.13 · version:8.1.0-alpha4 · version:8.1.0 (Marks an issue as being completely or in parts released in 8.1.0)

Comments

@Zelldon
Member

Zelldon commented Oct 29, 2021

Describe the bug

The customer reported that they got a SIGBUS error with the latest patch release while running their benchmark; it looks like it is happening in the journal.

Impact: Since the customer doesn't run on Kubernetes, the system had to be restarted.

https://www.geeksforgeeks.org/segmentation-fault-sigsegv-vs-bus-error-sigbus/
2) Bus Error (also known as SIGBUS, usually signal 10) occurs when a process is trying to access memory that the CPU cannot physically address. In other words, the memory the program tried to access is not a valid memory address. It is caused by alignment issues with the CPU (e.g. trying to read a long from an address which isn't a multiple of 4). SIGBUS is an abbreviation for “Bus Error”.

The SIGBUS signal occurs in the following cases:
-> The program instructs the CPU to read or write a specific physical memory address which is not valid / the requested physical address is unrecognized by the whole computer system.
-> Unaligned access of memory (for example, if multi-byte accesses must be 16-bit-aligned, addresses (given in bytes) at 0, 2, 4, 6, and so on would be considered aligned and therefore accessible, while addresses 1, 3, 5, and so on would be considered unaligned).

To Reproduce

I think we can ask them what exactly they are doing; what I know is that they have 24 partitions and 4 brokers with replication factor 4.

Expected behavior

No SIGBUS, I assume?

Log/Stacktrace

zeebe_broker2 | 2021-10-27 14:14:33.928 [] [raft-server-2-raft-partition-partition-6] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 33400058 (first log index: 33322496)
zeebe_broker2 | 2021-10-27 14:14:33.971 [] [raft-server-2-raft-partition-partition-22] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 33226084 (first log index: 33091065)
zeebe_broker2 | 2021-10-27 14:14:33.977 [] [raft-server-2-raft-partition-partition-24] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 51869258 (first log index: 51728804)
zeebe_broker2 | 2021-10-27 14:14:33.984 [] [raft-server-2-raft-partition-partition-10] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 34657647 (first log index: 34533361)
zeebe_broker2 | 2021-10-27 14:14:33.999 [] [raft-server-2-raft-partition-partition-6] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 33400058 (first log index: 33322496)
zeebe_broker2 | 2021-10-27 14:14:34.045 [] [raft-server-2-raft-partition-partition-22] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 33226084 (first log index: 33091065)
zeebe_broker2 | 2021-10-27 14:14:34.048 [] [raft-server-2-raft-partition-partition-24] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 51869258 (first log index: 51728804)
zeebe_broker2 | 2021-10-27 14:14:34.061 [] [raft-server-2-raft-partition-partition-10] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 34657647 (first log index: 34533361)
zeebe_broker2 | 2021-10-27 14:14:34.070 [] [raft-server-2-raft-partition-partition-6] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 33400058 (first log index: 33322496)
zeebe_broker2 | [thread 82 also had an error]
zeebe_broker2 | [thread 104 also had an error][thread 91 also had an error]
zeebe_broker2 |
zeebe_broker2 | [thread 99 also had an error]
zeebe_broker2 | #
zeebe_broker2 | # A fatal error has been detected by the Java Runtime Environment:
zeebe_broker2 | #
zeebe_broker2 | #  [thread 83 also had an error]
zeebe_broker2 | [thread 93 also had an error]
zeebe_broker2 | [thread 49 also had an error][thread 48 also had an error]
zeebe_broker2 |
zeebe_broker2 | SIGBUS (0x7) at pc=0x00007f91eb54d169, pid=8, tid=92
zeebe_broker2 | #
zeebe_broker2 | # JRE version: OpenJDK Runtime Environment 18.9 (11.0.13+8) (build 11.0.13+8)
zeebe_broker2 | # Java VM: OpenJDK 64-Bit Server VM 18.9 (11.0.13+8, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
zeebe_broker2 | # Problematic frame:
zeebe_broker2 | # v  ~StubRoutines::jbyte_disjoint_arraycopy
zeebe_broker2 | #
zeebe_broker2 | # Core dump will be written. Default location: /var/swiftcore/core.%e.8.%h.%t
zeebe_broker2 | #
zeebe_broker2 | # An error report file with more information is saved as:
zeebe_broker2 | # /usr/local/zeebe/hs_err_pid8.log
zeebe_broker2 | Compiled method (c2) 103585123 19667   !   4       io.camunda.zeebe.journal.file.SegmentedJournalWriter::append (54 bytes)
zeebe_broker2 |  total in heap  [0x00007f91f4170a90,0x00007f91f4174fe8] = 17752
zeebe_broker2 |  relocation     [0x00007f91f4170c08,0x00007f91f4170da0] = 408
zeebe_broker2 |  main code      [0x00007f91f4170da0,0x00007f91f41730a0] = 8960
zeebe_broker2 |  stub code      [0x00007f91f41730a0,0x00007f91f4173120] = 128
zeebe_broker2 |  oops           [0x00007f91f4173120,0x00007f91f4173148] = 40
zeebe_broker2 |  metadata       [0x00007f91f4173148,0x00007f91f4173340] = 504
zeebe_broker2 |  scopes data    [0x00007f91f4173340,0x00007f91f4174738] = 5112
zeebe_broker2 |  scopes pcs     [0x00007f91f4174738,0x00007f91f4174dd8] = 1696
zeebe_broker2 |  dependencies   [0x00007f91f4174dd8,0x00007f91f4174e00] = 40
zeebe_broker2 |  handler table  [0x00007f91f4174e00,0x00007f91f4174ef0] = 240
zeebe_broker2 |  nul chk table  [0x00007f91f4174ef0,0x00007f91f4174fe8] = 248
zeebe_broker2 | Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled
zeebe_broker2 | #
zeebe_broker2 | # If you would like to submit a bug report, please visit:
zeebe_broker2 | #   https://bugreport.java.com/bugreport/crash.jsp
zeebe_broker2 | #
zeebe_broker2 exited with code 134

hs_err_pid8.log

Environment:

  • OS:
  • Zeebe Version: 1.2.2
  • Configuration: 4 broker, 24 partitions, bare metal

Support Case: SUPPORT-11966

@Zelldon Zelldon added the kind/bug and scope/broker labels Oct 29, 2021
@npepinpe
Member

npepinpe commented Oct 29, 2021

They're not on K8S but are they using containers at all, or running bare metal?

This is the same customer which had too many segments memory mapped, right? Could it be they have so many that they ran out of memory to address? 😄 Are they running on a 32-bit system or a 64-bit system? Never mind, I can see they're running a 64-bit JDK.

Otherwise, regarding potential alignment issues, it would be interesting to know what CPU architecture they run on. That said, it's not unreasonable for us to enforce page alignment in general (it would probably even be beneficial), so we could do that regardless of this issue.

There are other cases where this can occur which might point to a bug. If a file is truncated while mapped, further reads may cause a SIGBUS in some JVMs (I believe this was patched, but I don't know in which versions or whether it was backported).
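For illustration, here is a minimal sketch of that truncation case (not Zeebe code; the path and sizes are made up, and running it may crash the JVM):

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class TruncateWhileMapped {
  public static void main(String[] args) throws Exception {
    try (RandomAccessFile file = new RandomAccessFile("/tmp/mapped-demo", "rw")) {
      // Grow the file to one page and map it read-write.
      file.setLength(4096);
      final MappedByteBuffer buffer =
          file.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 4096);
      buffer.put(0, (byte) 1); // fine, the page is backed by the file

      // Shrink the file while the mapping is still alive.
      file.setLength(0);

      // Touching the now-unbacked page raises SIGBUS; depending on the JVM and the
      // access path, the process either dies or an InternalError is reported.
      buffer.get(0);
    }
  }
}
```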

@Zelldon
Member Author

Zelldon commented Oct 29, 2021

So we have no alignment? 🤔 You can see the CPU in the HotSpot error report file.

@npepinpe
Member

No, but I'm not sure the alignment is the problem in this case (though I could be wrong). With mmap'd files, a SIGBUS more likely means we are writing/reading outside of the buffer. This can mean there was not enough virtual address space (though then the SIGBUS would happen when we map it, not on append), the file was truncated concurrently (unlikely), we're writing to a buffer that was since unmapped (though I would expect a SIGSEGV here), or there was some weird I/O error that wasn't well handled by the JVM (e.g. writing to a network storage which was detached concurrently).

See https://www.sublimetext.com/blog/articles/use-mmap-with-care#:~:text=SIGBUS%20(bus%20error)%20is%20a,we%20failed%20to%20read%2Fwrite for example

@Zelldon
Member Author

Zelldon commented Nov 1, 2021

What I can see from the customer data is that the broker which received the SIGBUS ran out of disk space and was not able to delete data on several partitions; on other partitions it was able to delete data. No exporters are configured.

We see endless log statements like:

2021-10-27 14:14:17.730 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)
zeebe_broker2 | 2021-10-27 14:14:17.730 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)
zeebe_broker2 | 2021-10-27 14:14:17.731 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)
zeebe_broker2 | 2021-10-27 14:14:17.731 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)

SIGBUS message, which includes the CPU and JDK details:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0x7) at pc=0x00007f91eb54d169, pid=8, tid=92
#
# JRE version: OpenJDK Runtime Environment 18.9 (11.0.13+8) (build 11.0.13+8)
# Java VM: OpenJDK 64-Bit Server VM 18.9 (11.0.13+8, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# v  ~StubRoutines::jbyte_disjoint_arraycopy
#
# Core dump will be written. Default location: /var/swiftcore/core.%e.8.%h.%t
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
---------------  S U M M A R Y ------------

Command Line: -Xmx6G -XX:+UseG1GC -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 -Xlog:gc*=debug:file=/usr/local/zeebe/logs/gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:+ExitOnOutOfMemoryError -Xms128m --illegal-access=deny -XX:+ExitOnOutOfMemoryError -Dapp.name=broker -Dapp.pid=8 -Dapp.repo=/usr/local/zeebe/lib -Dapp.home=/usr/local/zeebe -Dbasedir=/usr/local/zeebe io.camunda.zeebe.broker.StandaloneBroker

Host: Intel(R) Xeon(R) Platinum 8270 CPU @ 2.70GHz, 12 cores, 31G, Debian GNU/Linux 11 (bullseye)
Time: Wed Oct 27 14:14:34 2021 UTC elapsed time: 103585.121013 seconds (1d 4h 46m 25s)

@Zelldon
Member Author

Zelldon commented Nov 1, 2021

@deepthidevaki @npepinpe can we have an endless loop in deleting, or why else would these logs be printed for the same partition so often within milliseconds?

zeebe_broker2 | 2021-10-27 14:14:18.668 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)
zeebe_broker2 | 2021-10-27 14:14:18.669 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)
zeebe_broker2 | 2021-10-27 14:14:18.670 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)
zeebe_broker2 | 2021-10-27 14:14:18.671 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)
zeebe_broker2 | 2021-10-27 14:14:18.672 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)
zeebe_broker2 | 2021-10-27 14:14:18.672 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)
zeebe_broker2 | 2021-10-27 14:14:18.673 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)
zeebe_broker2 | 2021-10-27 14:14:18.675 [] [raft-server-2-raft-partition-partition-5] DEBUG
zeebe_broker2 |       io.camunda.zeebe.journal.file.SegmentedJournal - No segments can be deleted with index < 49499482 (first log index: 49463401)

@deepthidevaki
Contributor

Log deletion is not retried, so I expect this log only once, when a new snapshot is taken or received.

@deepthidevaki
Contributor

deepthidevaki commented Nov 1, 2021

Oh... wait! Maybe we try to compact when Raft detects it is out of disk (OOD)? Maybe in LeaderRole or FollowerRole.

@deepthidevaki
Contributor

@npepinpe npepinpe added the support label Nov 1, 2021
@deepthidevaki
Contributor

@Zelldon FYI, the OOD may be due to this issue: #7767

@npepinpe npepinpe added the 1.2.1 label Nov 1, 2021
@npepinpe npepinpe added this to Planned in Zeebe Nov 1, 2021
@npepinpe npepinpe moved this from Planned to Ready in Zeebe Nov 1, 2021
@npepinpe
Member

npepinpe commented Nov 1, 2021

Please timebox some effort to root cause it, then put it back in triage when we have more information on this.

@Zelldon Zelldon moved this from Ready to In progress in Zeebe Nov 1, 2021
@npepinpe
Member

npepinpe commented Nov 1, 2021

From the looks of it, it's usually because it's memory-mapping the files in /tmp - which succeeds, but then fails when it tries to write. I'm not sure why it's a SIGBUS and not the InternalError we've previously observed - could it be because in this case the JVM is not using its own MappedByteBuffer API but something else? I would be curious to try a small test which fills a disk, maps a file and then tries to write, and see if I get a SIGBUS, or if this only happens for JVM reports written to /tmp 🤔
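A hedged sketch of such a small test (assuming a nearly full filesystem mounted at the hypothetical path /mnt/full-disk, e.g. previously filled with dd):

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class FullDiskMmapTest {
  public static void main(String[] args) throws Exception {
    final long size = 128 * 1024 * 1024; // 128 MiB segment-like file
    try (RandomAccessFile file = new RandomAccessFile("/mnt/full-disk/segment", "rw")) {
      // setLength only updates the file's metadata; no disk blocks are reserved yet,
      // so this succeeds even when the filesystem is already full (sparse file).
      file.setLength(size);
      final MappedByteBuffer buffer =
          file.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, size);

      // Every write dirties a page the kernel must eventually back with a real block.
      // With no free blocks left, the page fault fails and the process gets a SIGBUS
      // (or, on some code paths, the JVM reports an InternalError instead).
      for (int i = 0; i < buffer.capacity(); i++) {
        buffer.put(i, (byte) 1);
      }
      buffer.force();
    }
  }
}
```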

@Zelldon
Member Author

Zelldon commented Nov 1, 2021

Could it be that if the disk is full and we end up in an endless loop like #8103, we print so many logs that we fill up /tmp and end up with a SIGBUS because the JVM fails with the full /tmp?

@npepinpe
Member

npepinpe commented Nov 1, 2021

Could be! I would feel a bit better about this than knowing we have to handle SIGBUS with our own memory mapped files 😄

@Zelldon
Member Author

Zelldon commented Nov 1, 2021

Was able to reproduce it.

{"severity":"DEBUG","logging.googleapis.com/sourceLocation":{"function":"storeExporterPositions","file":"ExporterPositionsDistributionService.java","line":44},"message":"Received new exporter state {MetricsExporter=674628}","serviceContext":{"service":"zeebe-broker","version":"zell-disk-test"},"context":{"threadId":27,"partitionId":"3","threadPriority":5,"loggerName":"io.camunda.zeebe.broker.exporter","threadName":"Broker-1-zb-fs-workers-2","actor-name":"Broker-1-Exporter-3"},"timestampSeconds":1635775939,"timestampNanos":140792000}
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0x7) at pc=0x00007fa2e754d162, pid=7, tid=62
#
# JRE version: OpenJDK Runtime Environment 18.9 (11.0.13+8) (build 11.0.13+8)
# Java VM: OpenJDK 64-Bit Server VM 18.9 (11.0.13+8, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# v  ~StubRoutines::jbyte_disjoint_arraycopy
#
# Core dump will be written. Default location: /core.%e.7.%t
#
# An error report file with more information is saved as:
# /usr/local/zeebe/data/zeebe_error7.log
[thread 60 also had an error]
[thread 61 also had an error]
Compiled method (c2)  660342 9894       4       io.camunda.zeebe.journal.file.record.SBESerializer::writeData (144 bytes)
 total in heap  [0x00007fa2ef710d90,0x00007fa2ef712d20] = 8080
 relocation     [0x00007fa2ef710f08,0x00007fa2ef710fd8] = 208
 main code      [0x00007fa2ef710fe0,0x00007fa2ef711fa0] = 4032
 stub code      [0x00007fa2ef711fa0,0x00007fa2ef711fc8] = 40
 oops           [0x00007fa2ef711fc8,0x00007fa2ef711fe0] = 24
 metadata       [0x00007fa2ef711fe0,0x00007fa2ef7120e0] = 256
 scopes data    [0x00007fa2ef7120e0,0x00007fa2ef712848] = 1896
 scopes pcs     [0x00007fa2ef712848,0x00007fa2ef712c48] = 1024
 dependencies   [0x00007fa2ef712c48,0x00007fa2ef712c58] = 16
 handler table  [0x00007fa2ef712c58,0x00007fa2ef712c88] = 48
 nul chk table  [0x00007fa2ef712c88,0x00007fa2ef712d20] = 152
Could not load hsdis-amd64.so; library not loadable; PrintAssembly is disabled
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
# 

After failing once, it seems to fail in a loop; I can't access the pod anymore:

  State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    134
      Started:      Mon, 01 Nov 2021 15:15:02 +0100
      Finished:     Mon, 01 Nov 2021 15:15:30 +0100
    Ready:          False

HOW

zeebe-values.yaml.txt

I set up the cluster (without Elastic and without the Elastic exporter), ran a benchmark, checked who is a follower, and created several big files to fill the disk:

dd if=/dev/zero of=filename bs=1G count=1
root@zell-disk-test-zeebe-1:/usr/local/zeebe/data# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          95G  5.5G   89G   6% /
tmpfs            64M     0   64M   0% /dev
tmpfs            30G     0   30G   0% /sys/fs/cgroup
/dev/sda1        95G  5.5G   89G   6% /exporters
shm              64M     0   64M   0% /dev/shm
/dev/sdc        4.9G  4.9G     0 100% /usr/local/zeebe/data
tmpfs            30G   12K   30G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs            30G     0   30G   0% /proc/acpi
tmpfs            30G     0   30G   0% /proc/scsi
tmpfs            30G     0   30G   0% /sys/firmware

Unfortunately the error log is empty after several restarts 🤷

root@zell-disk-test-zeebe-1:/usr/local/zeebe/data# ls -la
total 3758356
drwxr-xr-x 4 root  root       4096 Nov  1 14:17 .
drwxrwxr-x 1 zeebe root       4096 Nov  1 14:18 ..
-rw-r--r-- 1 root  root 1073741824 Nov  1 14:10 filename
-rw-r--r-- 1 root  root 1073741824 Nov  1 14:10 filename1
-rw-r--r-- 1 root  root 1073741824 Nov  1 14:10 filename2
-rw-r--r-- 1 root  root  626978816 Nov  1 14:10 filename3
-rw-r--r-- 1 root  root          0 Nov  1 14:24 gc.log
-rw-r--r-- 1 root  root     302538 Nov  1 14:11 gc.log.0
-rw-r--r-- 1 root  root        928 Nov  1 14:13 gc.log.1
-rw-r--r-- 1 root  root          0 Nov  1 14:14 gc.log.2
-rw-r--r-- 1 root  root          0 Nov  1 14:15 gc.log.3
-rw-r--r-- 1 root  root          0 Nov  1 14:16 gc.log.4
drwx------ 2 root  root      16384 Nov  1 14:01 lost+found
drwxr-xr-x 3 root  root       4096 Nov  1 14:01 raft-partition
-rw-r--r-- 1 root  root          0 Nov  1 14:12 zeebe_error7.log

@npepinpe
Member

npepinpe commented Nov 1, 2021

Are you filling the data volume or /tmp?

@Zelldon
Member Author

Zelldon commented Nov 1, 2021

Please see above: the /usr/local/zeebe/data volume.

@npepinpe
Member

npepinpe commented Nov 1, 2021

So writing to a memory-mapped file when there is no space left doesn't produce an InternalError but a SIGBUS? How come last time we observed it, it was an InternalError?

At any rate, the fix is then to pre-allocate files. We've already discussed this: pre-allocate the files before memory-mapping them to ensure we have enough disk space and don't need to deal with weird errors. See #7607.

EDIT: also, nice job on the quick reproduction 🎉

@Zelldon Zelldon removed this from In progress in Zeebe Nov 1, 2021
@npepinpe
Member

npepinpe commented Nov 2, 2021

I'll leave this in the backlog and instead we'll focus on #7607 and #7767 which should greatly mitigate this.

@Zelldon Zelldon removed their assignment Nov 2, 2021
@npepinpe npepinpe removed the 1.2.1 label Apr 11, 2022
@npepinpe npepinpe mentioned this issue Jul 8, 2022
zeebe-bors-camunda bot added a commit that referenced this issue Jul 11, 2022
9731: Preallocate segment files r=npepinpe a=npepinpe

## Description

This PR introduces segment file pre-allocation in the journal. This is on by default, but can be disabled via an experimental configuration option.

At the moment, the pre-allocation is done in a "dumb" fashion - we allocate a 4KiB block of zeroes and write it repeatedly until we've reached the expected file length. Note that this means there may be one extra block allocated on disk.
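Conceptually, the "dumb" pre-allocation looks roughly like the following (a simplified sketch, not the actual PR code; names are illustrative):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class SegmentPreallocator {
  private static final int BLOCK_SIZE = 4 * 1024;

  /** Writes zeroed 4KiB blocks until at least segmentSize bytes have been written. */
  static void preallocate(final FileChannel channel, final long segmentSize) throws IOException {
    final ByteBuffer zeros = ByteBuffer.allocateDirect(BLOCK_SIZE);
    long position = 0;
    while (position < segmentSize) {
      zeros.clear();
      position += channel.write(zeros, position);
    }
    // Since only whole blocks are written, the file may end up one block larger than
    // segmentSize, which matches the note above about one extra block on disk.
  }
}
```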

One thing to note: to verify this, we used [jnr-posix](https://github.com/jnr/jnr-posix). The reason is that we want to know the actual number of blocks reserved on disk for the file. `Files#size`, or `File#length`, returns the reported file size, which is part of the file's metadata (on UNIX systems anyway). If you mmap a file with a size of 1MB, write one byte, then flush it, the reported size will be 1MB, but the actual size on disk will be a single block (on most modern UNIX systems anyway). By using [stat](https://linux.die.net/man/2/stat), we can get the actual file size in terms of 512-byte allocated blocks, so we get a pretty accurate measurement of the actual disk space used by the file.
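To illustrate the reported-size vs. blocks-on-disk distinction, a sketch along these lines (assuming jnr-posix's `POSIXFactory` and `FileStat#blocks()`; the actual test helper in the PR may look different):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import jnr.posix.FileStat;
import jnr.posix.POSIX;
import jnr.posix.POSIXFactory;

final class DiskUsage {
  private static final POSIX POSIX_API = POSIXFactory.getPOSIX();

  /** Reported size from the file metadata; can be much larger than what is on disk. */
  static long reportedSize(final Path file) throws IOException {
    return Files.size(file);
  }

  /** Actual space reserved on disk: st_blocks counts 512-byte blocks. */
  static long actualSizeOnDisk(final Path file) {
    final FileStat stat = POSIX_API.stat(file.toString());
    return stat.blocks() * 512L;
  }
}
```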

I would've liked to capture this in a test utility, but since `test-util` depends on `util`, there wasn't an easy way to do this, so I just copied the method in two places. One possibility I thought of is moving the whole pre-allocation code into `journal`, since we only use it there. The only downside I can see is about discoverability and cohesion, but I'd like to hear your thoughts on this.

A follow-up PR will come which will optimize the pre-allocation by using [posix_fallocate](https://man7.org/linux/man-pages/man3/posix_fallocate.3.html) on POSIX systems.

Finally, I opted for an experimental configuration option instead of a feature flag. My reasoning is that it isn't a "new" feature; instead, we want the option of disabling this (potentially for performance reasons). So it's more of an advanced option. But I'd also like to hear your thoughts here.

## Related issues

closes #6504
closes #8099
related to #7607  



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Jul 13, 2022
9777: [Backports stable/8.0] Preallocate segment files r=npepinpe a=npepinpe

## Description

Backports #9731 to 8.0.x.

## Related issues

closes #6504
closes #8099



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Jul 13, 2022
9778: [Backports stable/1.3] Preallocate segment files r=npepinpe a=npepinpe

## Description

Backports #9731 to 1.3.x.

## Related issues

closes #6504
closes #8099



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
@Zelldon Zelldon added the version:8.1.0 label Oct 4, 2022