bug: pthread_create failed: Resource temporarily unavailable #711

Open
erneestoc opened this issue Nov 2, 2023 · 5 comments

Comments

@erneestoc

I'm running into this issue consistently on macOS. Is there some configuration I should set to make it work correctly? I'm not seeing this error with the Docker image.

runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x1a8aac724 m=2 sigcode=0

goroutine 0 [idle]:
runtime: g 0: unknown pc 0x1a8aac724
stack: frame={sp:0x16c5e2c00, fp:0x0} stack=[0x16bde7328,0x16c5e2f28)
0x000000016c5e2b00:  0x0000000000000001  0x00000001005e2ae8 
0x000000016c5e2b10:  0x0000000000000042  0x0000000000000000 
0x000000016c5e2b20:  0x0000000032aaaba2  0x0000000000000000 
0x000000016c5e2b30:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2b40:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2b50:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2b60:  0x0000000000000003  0x0000000000000000 
0x000000016c5e2b70:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2b80:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2b90:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2ba0:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2bb0:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2bc0:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2bd0:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2be0:  0x000000000000003c  0x0000000203ae9208 
0x000000016c5e2bf0:  0x000000016c5e2c10  0x42020001a898a938 
0x000000016c5e2c00: <0x0000000000000000  0x0000000000000003 
0x000000016c5e2c10:  0x000000016c5e2c48  0x000000016c5e3000 
0x000000016c5e2c20:  0x000000016c5e2c60  0x0a1c0001a89f1ae8 
0x000000016c5e2c30:  0x00000000007ff000  0xffffffff03ae9208 
0x000000016c5e2c40:  0x0000000000000000  0x00000000fffff9df 
0x000000016c5e2c50:  0x00000000007ff000  0x0000000203ae9208 
0x000000016c5e2c60:  0x000000016c5e2c90  0xaa168001048845b0 
0x000000016c5e2c70:  0x00000001a89f621d  0x0000000000000003 
0x000000016c5e2c80:  0x00000000007ff000  0x0000000000000023 
0x000000016c5e2c90:  0x000000016c5e2d00  0x00000001047e71e0 
0x000000016c5e2ca0:  0x0000000000000000  0xffffffff00000000 
0x000000016c5e2cb0:  0x0000000054484441  0x0000000000000000 
0x000000016c5e2cc0:  0x0000000000000000  0x0000000000800000 
0x000000016c5e2cd0:  0x00000000000008ff  0x0000000010010101 
0x000000016c5e2ce0:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2cf0:  0x000000000000007d  0x000000016c5e2ca8 
runtime: g 0: unknown pc 0x1a8aac724
stack: frame={sp:0x16c5e2c00, fp:0x0} stack=[0x16bde7328,0x16c5e2f28)
0x000000016c5e2b00:  0x0000000000000001  0x00000001005e2ae8 
0x000000016c5e2b10:  0x0000000000000042  0x0000000000000000 
0x000000016c5e2b20:  0x0000000032aaaba2  0x0000000000000000 
0x000000016c5e2b30:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2b40:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2b50:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2b60:  0x0000000000000003  0x0000000000000000 
0x000000016c5e2b70:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2b80:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2b90:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2ba0:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2bb0:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2bc0:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2bd0:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2be0:  0x000000000000003c  0x0000000203ae9208 
0x000000016c5e2bf0:  0x000000016c5e2c10  0x42020001a898a938 
0x000000016c5e2c00: <0x0000000000000000  0x0000000000000003 
0x000000016c5e2c10:  0x000000016c5e2c48  0x000000016c5e3000 
0x000000016c5e2c20:  0x000000016c5e2c60  0x0a1c0001a89f1ae8 
0x000000016c5e2c30:  0x00000000007ff000  0xffffffff03ae9208 
0x000000016c5e2c40:  0x0000000000000000  0x00000000fffff9df 
0x000000016c5e2c50:  0x00000000007ff000  0x0000000203ae9208 
0x000000016c5e2c60:  0x000000016c5e2c90  0xaa168001048845b0 
0x000000016c5e2c70:  0x00000001a89f621d  0x0000000000000003 
0x000000016c5e2c80:  0x00000000007ff000  0x0000000000000023 
0x000000016c5e2c90:  0x000000016c5e2d00  0x00000001047e71e0 
0x000000016c5e2ca0:  0x0000000000000000  0xffffffff00000000 
0x000000016c5e2cb0:  0x0000000054484441  0x0000000000000000 
0x000000016c5e2cc0:  0x0000000000000000  0x0000000000800000 
0x000000016c5e2cd0:  0x00000000000008ff  0x0000000010010101 
0x000000016c5e2ce0:  0x0000000000000000  0x0000000000000000 
0x000000016c5e2cf0:  0x000000000000007d  0x000000016c5e2ca8 

goroutine 1 [semacquire]:
runtime.gopark(0x10522f8e0?, 0x18?, 0x80?, 0x2d?, 0x104b4c600?)
	runtime/proc.go:398 +0xc8 fp=0x140092b77f0 sp=0x140092b77d0 pc=0x104060668
runtime.goparkunlock(...)
	runtime/proc.go:404
runtime.semacquire1(0x14000264dd0, 0xc8?, 0x1, 0x0, 0x20?)
	runtime/sema.go:160 +0x208 fp=0x140092b7840 sp=0x140092b77f0 pc=0x104072758
sync.runtime_Semacquire(0x140001618b8?)
	runtime/sema.go:62 +0x2c fp=0x140092b7880 sp=0x140092b7840 pc=0x1040901cc
sync.(*WaitGroup).Wait(0x14000264dc8)
	sync/waitgroup.go:116 +0x74 fp=0x140092b78a0 sp=0x140092b7880 pc=0x1040a1c74
golang.org/x/sync/errgroup.(*Group).Wait(0x14000264dc0)
	golang.org/x/sync@v0.1.0/errgroup/errgroup.go:53 +0x2c fp=0x140092b78c0 sp=0x140092b78a0 pc=0x10454ddec
main.run(0x140002cb600)
	github.com/buchgr/bazel-remote/v2/main.go:223 +0xed0 fp=0x140092b7bf0 sp=0x140092b78c0 pc=0x1047e2450
github.com/urfave/cli/v2.(*App).RunContext(0x140001dba40, {0x104c311e0?, 0x10525a6c0}, {0x140000320d0, 0x1, 0x1})
	github.com/urfave/cli/v2@v2.17.1/app.go:395 +0xbf4 fp=0x140092b7ec0 sp=0x140092b7bf0 pc=0x104748ff4
github.com/urfave/cli/v2.(*App).Run(...)
	github.com/urfave/cli/v2@v2.17.1/app.go:252
main.main()
	github.com/buchgr/bazel-remote/v2/main.go:55 +0x11c fp=0x140092b7f30 sp=0x140092b7ec0 pc=0x1047e151c
runtime.main()
	runtime/proc.go:267 +0x2bc fp=0x140092b7fd0 sp=0x140092b7f30 pc=0x10406020c
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x140092b7fd0 sp=0x140092b7fd0 pc=0x104094d44

goroutine 2 [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400008cf90 sp=0x1400008cf70 pc=0x104060668
runtime.goparkunlock(...)
	runtime/proc.go:404
runtime.forcegchelper()
	runtime/proc.go:322 +0xb8 fp=0x1400008cfd0 sp=0x1400008cf90 pc=0x1040604c8
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400008cfd0 sp=0x1400008cfd0 pc=0x104094d44
created by runtime.init.6 in goroutine 1
	runtime/proc.go:310 +0x24

goroutine 3 [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400008d760 sp=0x1400008d740 pc=0x104060668
runtime.goparkunlock(...)
	runtime/proc.go:404
runtime.bgsweep(0x0?)
	runtime/mgcsweep.go:321 +0x108 fp=0x1400008d7b0 sp=0x1400008d760 pc=0x10404b048
runtime.gcenable.func1()
	runtime/mgc.go:200 +0x28 fp=0x1400008d7d0 sp=0x1400008d7b0 pc=0x10403fab8
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400008d7d0 sp=0x1400008d7d0 pc=0x104094d44
created by runtime.gcenable in goroutine 1
	runtime/mgc.go:200 +0x6c

goroutine 4 [GC scavenge wait]:
runtime.gopark(0x140000561c0?, 0x1049f60b0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400008df50 sp=0x1400008df30 pc=0x104060668
runtime.goparkunlock(...)
	runtime/proc.go:404
runtime.(*scavengerState).park(0x1052240c0)
	runtime/mgcscavenge.go:425 +0x5c fp=0x1400008df80 sp=0x1400008df50 pc=0x10404885c
runtime.bgscavenge(0x0?)
	runtime/mgcscavenge.go:658 +0xac fp=0x1400008dfb0 sp=0x1400008df80 pc=0x104048e1c
runtime.gcenable.func2()
	runtime/mgc.go:201 +0x28 fp=0x1400008dfd0 sp=0x1400008dfb0 pc=0x10403fa58
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400008dfd0 sp=0x1400008dfd0 pc=0x104094d44
created by runtime.gcenable in goroutine 1
	runtime/mgc.go:201 +0xac

goroutine 5 [finalizer wait]:
runtime.gopark(0x0?, 0x14004b8e9c0?, 0x0?, 0x60?, 0x1000000010?)
	runtime/proc.go:398 +0xc8 fp=0x1400008c580 sp=0x1400008c560 pc=0x104060668
runtime.runfinq()
	runtime/mfinal.go:193 +0x108 fp=0x1400008c7d0 sp=0x1400008c580 pc=0x10403eba8
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400008c7d0 sp=0x1400008c7d0 pc=0x104094d44
created by runtime.createfing in goroutine 1
	runtime/mfinal.go:163 +0x80

goroutine 11 [GC worker (idle)]:
runtime.gopark(0x636e505d105f2?, 0x1?, 0x5c?, 0x77?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400008e730 sp=0x1400008e710 pc=0x104060668
runtime.gcBgMarkWorker()
	runtime/mgc.go:1293 +0xd8 fp=0x1400008e7d0 sp=0x1400008e730 pc=0x104041708
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400008e7d0 sp=0x1400008e7d0 pc=0x104094d44
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1217 +0x28

goroutine 12 [GC worker (idle)]:
runtime.gopark(0x636e505d10dc2?, 0x3?, 0x20?, 0x48?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400008ef30 sp=0x1400008ef10 pc=0x104060668
runtime.gcBgMarkWorker()
	runtime/mgc.go:1293 +0xd8 fp=0x1400008efd0 sp=0x1400008ef30 pc=0x104041708
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400008efd0 sp=0x1400008efd0 pc=0x104094d44
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1217 +0x28

goroutine 24 [GC worker (idle)]:
runtime.gopark(0x636e505d17b4b?, 0x3?, 0xe9?, 0xb9?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x14000088730 sp=0x14000088710 pc=0x104060668
runtime.gcBgMarkWorker()
	runtime/mgc.go:1293 +0xd8 fp=0x140000887d0 sp=0x14000088730 pc=0x104041708
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x140000887d0 sp=0x140000887d0 pc=0x104094d44
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1217 +0x28

goroutine 39 [GC worker (idle)]:
runtime.gopark(0x636e505d145bf?, 0x1?, 0x2d?, 0x5?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400035e730 sp=0x1400035e710 pc=0x104060668
runtime.gcBgMarkWorker()
	runtime/mgc.go:1293 +0xd8 fp=0x1400035e7d0 sp=0x1400035e730 pc=0x104041708
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400035e7d0 sp=0x1400035e7d0 pc=0x104094d44
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1217 +0x28

goroutine 40 [GC worker (idle)]:
runtime.gopark(0x636e505d1b801?, 0x1?, 0x14?, 0xb5?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400035ef30 sp=0x1400035ef10 pc=0x104060668
runtime.gcBgMarkWorker()
	runtime/mgc.go:1293 +0xd8 fp=0x1400035efd0 sp=0x1400035ef30 pc=0x104041708
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400035efd0 sp=0x1400035efd0 pc=0x104094d44
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1217 +0x28

goroutine 41 [GC worker (idle)]:
runtime.gopark(0x636e505e40128?, 0x1?, 0x1?, 0x5b?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400035f730 sp=0x1400035f710 pc=0x104060668
runtime.gcBgMarkWorker()
	runtime/mgc.go:1293 +0xd8 fp=0x1400035f7d0 sp=0x1400035f730 pc=0x104041708
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400035f7d0 sp=0x1400035f7d0 pc=0x104094d44
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1217 +0x28

goroutine 42 [GC worker (idle)]:
runtime.gopark(0x636e505e4d7d6?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400035ff30 sp=0x1400035ff10 pc=0x104060668
runtime.gcBgMarkWorker()
	runtime/mgc.go:1293 +0xd8 fp=0x1400035ffd0 sp=0x1400035ff30 pc=0x104041708
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400035ffd0 sp=0x1400035ffd0 pc=0x104094d44
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1217 +0x28

goroutine 25 [GC worker (idle)]:
runtime.gopark(0x636e505e45ad6?, 0x1?, 0xcd?, 0xfa?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x14000088f30 sp=0x14000088f10 pc=0x104060668
runtime.gcBgMarkWorker()
	runtime/mgc.go:1293 +0xd8 fp=0x14000088fd0 sp=0x14000088f30 pc=0x104041708
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x14000088fd0 sp=0x14000088fd0 pc=0x104094d44
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1217 +0x28

goroutine 13 [chan receive]:
runtime.gopark(0x140000b6fb8?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400008f640 sp=0x1400008f620 pc=0x104060668
runtime.chanrecv(0x140000b6f60, 0x1400008f790, 0x1)
	runtime/chan.go:583 +0x414 fp=0x1400008f6c0 sp=0x1400008f640 pc=0x10402c534
runtime.chanrecv2(0x0?, 0x0?)
	runtime/chan.go:447 +0x14 fp=0x1400008f6f0 sp=0x1400008f6c0 pc=0x10402c104
github.com/buchgr/bazel-remote/v2/utils/backendproxy.StartUploaders.func1()
	github.com/buchgr/bazel-remote/v2/utils/backendproxy/backendproxy.go:30 +0xb0 fp=0x1400008f7d0 sp=0x1400008f6f0 pc=0x1045c3d80
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400008f7d0 sp=0x1400008f7d0 pc=0x104094d44
created by github.com/buchgr/bazel-remote/v2/utils/backendproxy.StartUploaders in goroutine 1
	github.com/buchgr/bazel-remote/v2/utils/backendproxy/backendproxy.go:29 +0x70

goroutine 14 [chan receive]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400008fe40 sp=0x1400008fe20 pc=0x104060668
runtime.chanrecv(0x140000b6f60, 0x1400008ff90, 0x1)
	runtime/chan.go:583 +0x414 fp=0x1400008fec0 sp=0x1400008fe40 pc=0x10402c534
runtime.chanrecv2(0x0?, 0x0?)
	runtime/chan.go:447 +0x14 fp=0x1400008fef0 sp=0x1400008fec0 pc=0x10402c104
github.com/buchgr/bazel-remote/v2/utils/backendproxy.StartUploaders.func1()
	github.com/buchgr/bazel-remote/v2/utils/backendproxy/backendproxy.go:30 +0xb0 fp=0x1400008ffd0 sp=0x1400008fef0 pc=0x1045c3d80
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400008ffd0 sp=0x1400008ffd0 pc=0x104094d44
created by github.com/buchgr/bazel-remote/v2/utils/backendproxy.StartUploaders in goroutine 1
	github.com/buchgr/bazel-remote/v2/utils/backendproxy/backendproxy.go:29 +0x70

goroutine 15 [chan receive]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:398 +0xc8 fp=0x1400035a640 sp=0x1400035a620 pc=0x104060668
runtime.chanrecv(0x140000b6f60, 0x1400035a790, 0x1)
	runtime/chan.go:583 +0x414 fp=0x1400035a6c0 sp=0x1400035a640 pc=0x10402c534
runtime.chanrecv2(0x0?, 0x0?)
	runtime/chan.go:447 +0x14 fp=0x1400035a6f0 sp=0x1400035a6c0 pc=0x10402c104
github.com/buchgr/bazel-remote/v2/utils/backendproxy.StartUploaders.func1()
	github.com/buchgr/bazel-remote/v2/utils/backendproxy/backendproxy.go:30 +0xb0 fp=0x1400035a7d0 sp=0x1400035a6f0 pc=0x1045c3d80
runtime.goexit()
	runtime/asm_arm64.s:1197 +0x4 fp=0x1400035a7d0 sp=0x1400035a7d0 pc=0x104094d44
created by github.com/buchgr/bazel-remote/v2/utils/backendproxy.StartUploaders in goroutine 1
	github.com/buchgr/bazel-remote/v2/utils/backendproxy/backendproxy.go:29 +0x70
@mostynb
Collaborator

mostynb commented Nov 6, 2023

Hi, could you share the flags/config file you're running bazel-remote with? Is this the entire crash output?

@DolceTriade

DolceTriade commented Jan 6, 2024

I hit this as well, on macOS ARM64.

Flags

./bazel-remote --dir /tmp/bazel --max_size 5

Logs

I can't post the full logs because they're 45M. The crash output contains stack traces for 32k goroutines.
Logs before the crash:

2024/01/06 00:44:27 GRPC CAS HEAD 8eb1f3322cd07a04f7f6e6b2d3e9bf53f7c2611a7649748d35bd7adb36f89bdc OK
2024/01/06 00:44:27 GRPC CAS HEAD 990fb33d8ec4225cf16e6cb414a8da1ff8ac17bb8348ee22d37ab797be4922d5 OK
2024/01/06 00:44:27 GRPC CAS HEAD dfa4599da6b9c68c4cdb311798adb46476440585242069db7942b206cd77a6b3 OK
2024/01/06 00:44:27 GRPC CAS HEAD bb3fcf3f6fad09713f161687ff7ed1f8f289c8142554f1e0214486921af4b595 OK
2024/01/06 00:44:27 GRPC CAS HEAD d94f2fc2f752db237950f180c4a707abb8c7c05ca68077c219b34dca0e67b211 OK
2024/01/06 00:44:28 GRPC BYTESTREAM WRITE COMPLETED: uploads/f0ccdb56-2e3d-4f8e-b03a-be505a6ea252/blobs/aac7cfdddf61c7b7ae93c283cfdc81a229f4ccb43f98c39d20ab3f7a50b6b34c/2753
runtime/cgo: pthread_create failed: Resource temporarily unavailable

Some example traces are:

goroutine 1 [semacquire, 1 minutes]:
runtime.gopark(0x101ef6b00?, 0x100d1ceac?, 0x40?, 0x8e?, 0x18?)
        GOROOT/src/runtime/proc.go:381 +0xe4 fp=0x14000e2f970 sp=0x14000e2f950 pc=0x100d4a264
runtime.goparkunlock(...)
        GOROOT/src/runtime/proc.go:387
runtime.semacquire1(0x14000dae850, 0x9c?, 0x1, 0x0, 0x40?)
        GOROOT/src/runtime/sema.go:160 +0x21c fp=0x14000e2f9d0 sp=0x14000e2f970 pc=0x100d5c3cc
sync.runtime_Semacquire(0x1400039fa48?)
        GOROOT/src/runtime/sema.go:62 +0x2c fp=0x14000e2fa10 sp=0x14000e2f9d0 pc=0x100d79abc
sync.(*WaitGroup).Wait(0x14000dae848)
        GOROOT/src/sync/waitgroup.go:116 +0x78 fp=0x14000e2fa30 sp=0x14000e2fa10 pc=0x100d8a6c8
golang.org/x/sync/errgroup.(*Group).Wait(0x14000dae840)
        external/org_golang_x_sync/errgroup/errgroup.go:53 +0x2c fp=0x14000e2fa50 sp=0x14000e2fa30 pc=0x10123f9bc
main.run(0x14000129880)
        main.go:223 +0xba8 fp=0x14000e2fc30 sp=0x14000e2fa50 pc=0x1014d5e68
github.com/urfave/cli/v2.(*App).RunContext(0x1400016ea80, {0x101916820?, 0x14000196010}, {0x140001ac000, 0x5, 0x5})
        external/com_github_urfave_cli_v2/app.go:395 +0xc04 fp=0x14000e2ff00 sp=0x14000e2fc30 pc=0x10143b1d4
github.com/urfave/cli/v2.(*App).Run(...)
        external/com_github_urfave_cli_v2/app.go:252
main.main()
        main.go:55 +0x138 fp=0x14000e2ff70 sp=0x14000e2ff00 pc=0x1014d5258
runtime.main()
        GOROOT/src/runtime/proc.go:250 +0x248 fp=0x14000e2ffd0 sp=0x14000e2ff70 pc=0x100d49e38
runtime.goexit()
        src/runtime/asm_arm64.s:1172 +0x4 fp=0x14000e2ffd0 sp=0x14000e2ffd0 pc=0x100d7e4c4
goroutine 13028 [chan receive]:
runtime.gopark(0x101787760?, 0x1400a98da40?, 0x60?, 0xa4?, 0x100de79f8?)
        GOROOT/src/runtime/proc.go:381 +0xe4 fp=0x1400a98d7a0 sp=0x1400a98d780 pc=0x100d4a264
runtime.chanrecv(0x1400a74cf60, 0x1400a98da50, 0x1)
        GOROOT/src/runtime/chan.go:583 +0x45c fp=0x1400a98d830 sp=0x1400a98d7a0 pc=0x100d1623c
runtime.chanrecv1(0x1400a74cf00?, 0x0?)
        GOROOT/src/runtime/chan.go:442 +0x14 fp=0x1400a98d860 sp=0x1400a98d830 pc=0x100d15da4
github.com/buchgr/bazel-remote/v2/server.(*grpcServer).Write(0x14000dae940, {0x10191ae10?, 0x1400a76a430})
        server/grpc_bytestream.go:564 +0x4fc fp=0x1400a98dac0 sp=0x1400a98d860 pc=0x1014ab9ec
google.golang.org/genproto/googleapis/bytestream._ByteStream_Write_Handler({0x1018d3fa0?, 0x14000dae940}, {0x101918d68?, 0x1400a2f9770})
        bazel-out/darwin_arm64-fastbuild-ST-eccb913b7463/bin/external/go_googleapis/google/bytestream/bytestream_go_proto_/google.golang.org/genproto/googleapis/bytestream/bytestream.pb.go:709 +0x98 fp=0x1400a98db00 sp=0x1400a98dac0 pc=0x101495338
google.golang.org/grpc.(*Server).processStreamingRPC(0x140001583c0, {0x10191b4f8, 0x140107f3d40}, 0x1400a768480, 0x14000dad0e0, 0x101edc220, 0x0)
        external/org_golang_google_grpc/server.go:1631 +0x1000 fp=0x1400a98de20 sp=0x1400a98db00 pc=0x1011fd650
google.golang.org/grpc.(*Server).handleStream(0x140001583c0, {0x10191b4f8, 0x140107f3d40}, 0x1400a768480, 0x0)
        external/org_golang_google_grpc/server.go:1718 +0x7e4 fp=0x1400a98df50 sp=0x1400a98de20 pc=0x1011feb74
google.golang.org/grpc.(*Server).serveStreams.func1.1()
        external/org_golang_google_grpc/server.go:959 +0x84 fp=0x1400a98dfd0 sp=0x1400a98df50 pc=0x1011f8544
runtime.goexit()
        src/runtime/asm_arm64.s:1172 +0x4 fp=0x1400a98dfd0 sp=0x1400a98dfd0 pc=0x100d7e4c4
created by google.golang.org/grpc.(*Server).serveStreams.func1
        external/org_golang_google_grpc/server.go:957 +0x164

And

goroutine 16113 [select]:
runtime.gopark(0x14004aa8f60?, 0x4?, 0x48?, 0x8d?, 0x14004aa8ea8?)
        GOROOT/src/runtime/proc.go:381 +0xe4 fp=0x14004aa8d00 sp=0x14004aa8ce0 pc=0x100d4a264
runtime.selectgo(0x14004aa8f60, 0x14004aa8ea0, 0x140012d8d20?, 0x0, 0x0?, 0x1)
        GOROOT/src/runtime/select.go:327 +0x690 fp=0x14004aa8e20 sp=0x14004aa8d00 pc=0x100d5b4a0
google.golang.org/grpc/internal/transport.(*http2Server).keepalive(0x1400049e9c0)
        external/org_golang_google_grpc/internal/transport/http2_server.go:1155 +0x188 fp=0x14004aa8fb0 sp=0x14004aa8e20 pc=0x101199618
google.golang.org/grpc/internal/transport.NewServerTransport.func4()
        external/org_golang_google_grpc/internal/transport/http2_server.go:344 +0x28 fp=0x14004aa8fd0 sp=0x14004aa8fb0 pc=0x101192be8
runtime.goexit()
        src/runtime/asm_arm64.s:1172 +0x4 fp=0x14004aa8fd0 sp=0x14004aa8fd0 pc=0x100d7e4c4
created by google.golang.org/grpc/internal/transport.NewServerTransport
        external/org_golang_google_grpc/internal/transport/http2_server.go:344 +0x1528
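
When a dump is too large to post, a per-state goroutine count may be enough to show where the goroutines pile up. The following is a minimal sketch, not part of bazel-remote: `countStates` is a hypothetical helper, and it assumes goroutine header lines look like the ones in the traces above (`goroutine N [state]:` or `goroutine N [state, duration]:`).

```go
package main

import (
	"fmt"
	"io"
	"os"
	"regexp"
	"sort"
	"strings"
)

// goroutineHeader matches lines such as "goroutine 13028 [chan receive]:" or
// "goroutine 1 [semacquire, 1 minutes]:" and captures the state before any comma.
// Assumption: the header format matches the traces shown in this thread.
var goroutineHeader = regexp.MustCompile(`^goroutine \d+ \[([^\],]+)`)

// countStates tallies goroutines per state in the text of a Go crash dump.
func countStates(dump string) map[string]int {
	counts := make(map[string]int)
	for _, line := range strings.Split(dump, "\n") {
		if m := goroutineHeader.FindStringSubmatch(line); m != nil {
			counts[m[1]]++
		}
	}
	return counts
}

func main() {
	// Read the whole dump from standard input.
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
	counts := countStates(string(data))

	// Sort states by descending count, then by name for stable output.
	states := make([]string, 0, len(counts))
	for s := range counts {
		states = append(states, s)
	}
	sort.Slice(states, func(i, j int) bool {
		if counts[states[i]] != counts[states[j]] {
			return counts[states[i]] > counts[states[j]]
		}
		return states[i] < states[j]
	})
	for _, s := range states {
		fmt.Printf("%6d  %s\n", counts[s], s)
	}
}
```

It could be run as `go run summarize.go < crash.log`, where both file names are placeholders.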

@mostynb
Collaborator

mostynb commented Jan 6, 2024

@DolceTriade: thanks for the extra details. 32k goroutines sounds quite high for a 5G cache. What kind of hardware are you using? Do you have a lot of users or a lot of incoming requests?

@DolceTriade

DolceTriade commented Jan 7, 2024

I'm running this solo on my regular M1 Mac laptop for testing purposes. I'm basically just building my company's Bazel repository.

After investigating, I think Bazel uploads a lot of Python files very quickly, which causes a huge spike in uploads. I haven't tried this yet, but I suspect that rate limiting uploads with --remote_max_connections=10 on the Bazel side might mitigate the issue (it works with other remote caches I've tried). I've also crashed other remote caches while building our repository, and I suspect the reason is the same. Bazel's default of 100 max remote connections doesn't seem like much, but apparently Bazel can multiplex up to 100 uploads over each connection. That allows up to 100 × 100 = 10k simultaneous uploads, so it seems feasible that we can hit this situation.
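
The estimate above can be written out as a small calculation. This is only an illustration of the arithmetic: `maxConcurrentUploads` is a hypothetical helper, and the figure of 100 multiplexed uploads per connection is the one quoted in this comment, not something verified against Bazel.

```go
package main

import "fmt"

// maxConcurrentUploads returns the upper bound on in-flight uploads when each
// connection can carry streamsPerConn multiplexed uploads.
func maxConcurrentUploads(connections, streamsPerConn int) int {
	return connections * streamsPerConn
}

func main() {
	// Assumption: 100 uploads per connection, as quoted in the comment above.
	const streamsPerConn = 100

	// Bazel's default of 100 remote connections.
	fmt.Println(maxConcurrentUploads(100, streamsPerConn)) // prints 10000

	// With --remote_max_connections=10.
	fmt.Println(maxConcurrentUploads(10, streamsPerConn)) // prints 1000
}
```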

@DolceTriade

DolceTriade commented Jan 7, 2024

I can report that passing in --remote_max_connections=10 does indeed mitigate the problem. Ideally bazel-remote should throttle responses and not crash for people who accidentally forget to set the flag.

My testing command was:

bazel test -c opt --bes_backend=grpc://localhost:50332 --bes_results_url="http://localhost:3000/invocation/" --remote_cache=grpc://localhost:9092 --test_env=GO_TEST_WRAP_TESTV=1 --remote_max_connections=10 -- //...
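
As a rough illustration of the kind of server-side throttling suggested above, a buffered channel can act as a counting semaphore that caps how many handlers do work at the same time. This is a generic sketch, not bazel-remote's implementation: `limiter`, `newLimiter`, `do`, and `peakConcurrency` are hypothetical names, and whether such a cap would prevent the `pthread_create` failure reported here is an untested assumption.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// limiter caps how many callers may run at once, using a buffered channel
// as a counting semaphore.
type limiter struct {
	slots chan struct{}
}

func newLimiter(max int) *limiter {
	return &limiter{slots: make(chan struct{}, max)}
}

// do blocks until a slot is free, runs fn, then releases the slot.
func (l *limiter) do(fn func()) {
	l.slots <- struct{}{}
	defer func() { <-l.slots }()
	fn()
}

// peakConcurrency pushes `tasks` goroutines through a limiter of size `max`
// and returns the highest number of tasks observed inside the limiter at once.
func peakConcurrency(tasks, max int) int64 {
	lim := newLimiter(max)
	var running, peak int64
	var wg sync.WaitGroup
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			lim.do(func() {
				n := atomic.AddInt64(&running, 1)
				// Raise the recorded peak if this task pushed it higher.
				for {
					p := atomic.LoadInt64(&peak)
					if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
						break
					}
				}
				atomic.AddInt64(&running, -1)
			})
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	// 1000 simulated uploads, at most 10 allowed to run at once.
	fmt.Println(peakConcurrency(1000, 10) <= 10) // prints true
}
```

Goroutines waiting for a slot are parked on the channel, so the cap applies to the work inside `do` rather than to the number of goroutines created.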
