Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Firecracker socket resets connection on shutdown #228

Open
sriramsv opened this issue Apr 22, 2020 · 7 comments
Open

Firecracker socket resets connection on shutdown #228

sriramsv opened this issue Apr 22, 2020 · 7 comments

Comments

@sriramsv
Copy link

while trying out the machine.Shutdown() call from the go-sdk, the firecracker socket resets the connection and also shuts down the VM disgracefully due to the error.

I also used the new ForwardSignals set to []os.Signal{} to make sure firecracker does not listen to any interrupt signals coming from the host operating system.

Unable to send CtrlAltDel: Put "http://localhost/actions": read unix @->/run/firecracker.socket: read: connection reset by peer  component=microvm-0
DEBU[2020-04-21T20:20:32] instance Destroy()                            component="microvm-0 (867e7d58)" error="Put \"http://localhost/actions\": read unix @->/run/firecracker.socket: read: connection reset by peer"
INFO[2020-04-21T20:20:32] Finished shutdown of VMM                      component=vmm
WARN[2020-04-21T20:20:33] firecracker exited: signal: interrupt         component=microvm-0
DEBU[2020-04-21T20:20:33] closing the exitCh 1 error occurred:
	* signal: interrupt
  component=microvm-0

However when I try using the same kernel image with firectl, I am able to cleanly shutdown using the curl command to the firecracker socket

ps aux | grep firecracker
sriramv+ 12276  7.4  0.5 530096 40892 pts/0    Sl+  00:57   0:05 /usr/local/bin/firecracker --api-sock /home/sriramvenkat/.firecracker.sock-12268-81
sriramv+ 12366  0.0  0.0 112712   976 pts/1    S+   00:58   0:00 grep --color=auto firecracker
[sriramvenkat@containedwaned ~]$ curl --unix-socket  /home/sriramvenkat/.firecracker.sock-12268-81 -i     -X PUT "http://localhost/actions"     -H  "accept: application/json"     -H  "Content-Type: application/json"     -d "{
>              \"action_type\": \"SendCtrlAltDel\"
>     }"
HTTP/1.1 204
Server: Firecracker API
Connection: keep-alive
[sriramvenkat@containedwaned ~]$ ps aux | grep firecracker
sriramv+ 12276  5.3  0.5 530100 40892 pts/0    Sl+  00:57   0:05 /usr/local/bin/firecracker --api-sock /home/sriramvenkat/.firecracker.sock-12268-81
sriramv+ 12391  0.0  0.0 112712   976 pts/1    S+   00:59   0:00 grep --color=auto firecracker
@kzys
Copy link
Contributor

kzys commented Jun 15, 2020

What is the configuration of your Linux kernel? Shutdown internally uses SendCtrlDel, which is not supported by some Linux kernels. See firecracker-microvm/firecracker#1095 for the detail.

@alakesh
Copy link
Contributor

alakesh commented Apr 23, 2021

@sriramsv Do you still see this issue?

@alexellis
Copy link

alexellis commented Apr 22, 2024

I am experiencing the same behaviour as described here.

Running the above curl command against the socket causes a reboot and a clean shutdown:

[  OK  ] Stopped Create Static Device Nodes in /dev.
[  OK  ] Stopped Create System Users.
[  OK  ] Stopped Remount Root and Kernel File Systems.
[  OK  ] Reached target System Shutdown.
[  OK  ] Reached target Late Shutdown Services.
[  OK  ] Finished System Reboot.
[  OK  ] Reached target System Reboot.
[   39.516404] reboot: Restarting system
2024-04-22T15:08:31.644600842 [anonymous-instance:main] Vmm is stopping.
2024-04-22T15:08:31.644866100 [anonymous-instance:main] Vmm is stopping.
2024-04-22T15:08:31.775123014 [anonymous-instance:main] Firecracker exiting successfully. exit_code=0

When using the SDK and calling Shutdown(), I see:

ERRO[0013] Unable to send CtrlAltDel: Put "http://localhost/actions": read unix @->/var/run/sockets/firecracker-621912026/firecracker.sock: read: connection reset by peer 
2024/04/22 15:09:17 Error shutting down of-k3s-1: Put "http://localhost/actions": read unix @->/var/run/sockets/firecracker-621912026/firecracker.sock: read: connection reset by peer

In the guest, I see no new output on the serial console.

firecracker.Machine

// Shutdown requests a clean shutdown of the VM by sending CtrlAltDelete on the virtual keyboard
func (m *Machine) Shutdown(ctx context.Context) error {
	m.logger.Debug("Called machine.Shutdown()")
	return m.sendCtrlAltDel(ctx)
}

For some reason, there have been no SDK releases since Sept 2022, but I've tried using that and the latest commit in the main branch. I got the same results in either case.

Firecracker v1.7.0

@alexellis
Copy link

@swagatbora90 would you have any suggestions here? I saw you'd made commits into the SDK repository recently, feel free to ping someone else if needed.

@alexellis
Copy link

alexellis commented Apr 22, 2024

It seems like the bug is potentially a regression with signal passing, since when I invoke the same codepath through an internal HTTP endpoint, the shutdown method works as expected.

When I run control + c on the controller program, the signal appears to be propagated directly to Firecracker causing it to exit immediately meaning it's unable to handle the HTTP request.

I overrode the list of signals to make sure that nothing that would cause a termination would get passed through and can see that in the Go SDK:

DEBU[0001] Setting up signal handler: [user defined signal 1] 

The debug logs show that this signal propagation is faulty, despite only being configured to send through USR1, the control + c went to the Firecracker process and terminated it before the controller could clean it up.

DEBU[0061] Called machine.Shutdown()                    
WARN[0061] firecracker exited: signal: interrupt        

@bduffany
Copy link
Contributor

bduffany commented May 8, 2024

When I run control + c on the controller program, the signal appears to be propagated directly to Firecracker causing it to exit immediately meaning it's unable to handle the HTTP request.

When you press Ctrl+C in the terminal, it sends the signal to all processes in the current process group rather than just a single process. I think the Go SDK would need to allow spawning firecracker/jailer processes with syscall.SysProcAttr{Setpgid: true} so that firecracker/jailer can be placed in their own process group, which would prevent them from getting signaled. Currently the SDK doesn't provide any options for customizing the exec.Cmd for firecracker/jailer, so there would need to be a patch to allow this

@alexellis
Copy link

That's a fair point, and what we eventually found out too. Thanks for commenting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants