Windows: Run containerd as managed process #43893
base: master
Conversation
Force-pushed from 37fc37a to 1c7add6
.github/workflows/windows.yml (Outdated)
@@ -346,10 +346,14 @@ jobs:
        name: Starting test daemon
        run: |
          Write-Host "Creating service"
-         If ("${{ matrix.runtime }}" -eq "containerd") {
+         If ("${{ matrix.runtime }}" -eq "containerd" -and "${{ matrix.os }}" -eq "windows-2019") {
            $runtimeArg="--containerd=\\.\pipe\containerd-containerd"
Now that the remote.Debug.Address
no longer defaults to the system containerd, should we also pass the system container's debug socket on this CLI? I haven't looked to see how that socket is used in the tests or otherwise, this is just flagging from the change in libcontainerd/supervisor/remote_daemon_windows.go.
I see the Unix-side code doesn't do this either, perhaps this debug socket is only used for a locally-started containerd?
Found this comment in the containerd code: "DefaultDebugAddress is the default winpipe address for pprof data", so I understand that it's purely for debugging, but I have no idea how to use it.
I don't have the source in front of me, but GitHub search is telling me that Docker doesn't ever use the socket. Either I'm just not searching well enough, or some refactoring has disappeared the use-cases for this and no one noticed? Or it's actually supposed to be used by someone running ctr pprof, or another tool entirely, not by Docker itself.
Anyway, still browsing GitHub, Docker's libcontainerd/supervisor always passes in a Debug socket (pointing to a private socket in the changes you've made here parallel to the private containerd socket), and when we start containerd in the system test, without passing one in, it doesn't start listening (i.e. default is "" on Windows, Linux and Solaris).
So what we have now is fine, since we don't try and collect the pprof data in the test run, and we shouldn't see any conflicts between system-containerd and docker-private containerd instances over the socket name.
There might be a slight separate bug related to this, as monitorDaemon deletes the GRPC socket, but not the Debug socket, when starting/restarting containerd. Probably not an issue, but if we start seeing bundled-containerd startups/restarts failing because the pprof socket already exists, then that'll be why. (Not sure if npipe semantics on Windows are different from unix sockets in this regard; my recollection is that neither requires explicit removal after the ends are closed, so removing the GRPC socket in that code might just be bonus safety.)
And then there's #43901.
I don't have the source in front of me, but GitHub search is telling me that Docker doesn't ever use the socket.
Correct, we don't use the socket; I think the location is set for two reasons:
- To prevent containerd using the default location, which could conflict with other running containerd instances (in situations where multiple dockerd daemons are started on a single machine)
- Because (in the current situation) dockerd always creates the containerd.toml configuration file, which means that there's otherwise no way to configure the debug-socket for debugging, if needed.
There might be a slight separate bug related to this, as monitorDaemon deletes the GRPC socket, but not the Debug socket when starting/restarting containerd.
I noticed that as well when working on #43946, and want to dig a bit;
It's a bit odd for dockerd to remove the socket, as the socket would be created by containerd, not docker (so from that perspective one would expect containerd to also clean it up). However, the supervisor is somewhat of a bare-bones process-manager, and (this is the part I want to dig into the history of) there may have been situations where containerd wasn't shut down cleanly and the containerd.sock was not cleaned up, causing the daemon to fail to start.
For the cleanup of the debug-socket: perhaps it was left because the daemon wouldn't try to open that file, so cleaning up/re-creating was left to containerd.
cmd/dockerd/daemon_windows.go
Outdated
- func (cli *DaemonCli) initContainerD(_ context.Context) (func(time.Duration) error, error) {
+ func (cli *DaemonCli) initContainerD(ctx context.Context) (func(time.Duration) error, error) {
      if cli.Config.ContainerdAddr == "" && cli.Config.DefaultRuntime == config.WindowsV2RuntimeName {
          cli.Config.ContainerdAddr = supervisor.GrpcPipeName
I'll note that the Unix-side implementation attempts to detect a running system containerd first; this is feasible on Windows from what I've read on Stack Overflow, but probably not as trivial as the Unix-side os.Lstat. That might be a nice improvement for the next iteration, as then the runtime-args test path could be reunified to just pass --default-runtime=io.containerd.runhcs.v1 for both system containerd (on the default pipe) and managed containerd (on a static private pipe).
Yes, I noticed that. However, there containerd was taken into use at a very early phase. Windows containerd is already in use with nerdctl and Kubernetes in some environments, and if we want to avoid the issue which you mentioned in #43892 (comment):
A specific expectation/benefit for me of this is that the zip includes the exact containerd version which was used in CI, reducing the scope for mysterious bugs.
then in those environments it might also be better to use this approach.
In that comment, I was thinking more about "static" binary downloads, which I think have a fairly focused use-case of an all-in-one download with no external dependencies or installers.
Personally I'm undecided whether I would expect a bundled containerd in the successor of DockerMSFTProvider, or I'd expect it to depend on another DockerContainerdProvider. (I guess a winget setup will be the likely successor, which implies someone builds an MSI for this, and so they are making the decisions or extending the options anyway.) I spent a long time in Debian where the answer would be "Of course they should be separate packages", but I recognise that this may be a somewhat-fringe feeling in the real world. ^_^
Personally I like to be able to mix-and-match versions in my own processes, but lock down versions in the outputs. But I have the luxury of time to do that. If I just needed something working now, I'd just grab one of the versions of containerd which is recorded as "tested against" in the docs or CI of the dockerd I'm grabbing.
I think I'd be somewhat surprised if I installed nerdctl and Docker on my dev box, and found them talking to different containerd instances. Namespaces, sure, but I expect containerd to have enough API stability that they'd both work when talking to the higher of the two dependencies, modulo bugs.
containerd-cri and production hosting deployments are a different question though.
Side-note: I've previously been given a commercial software installer that relied upon being able to call Docker on the node in order to build images used by its Pods, and never pushed them to a registry. The k8s dockershim deprecation of course changed that calculus, and the software in question migrated to Helm charts later anyway. I dunno if there's many other cases like that around which expect a given host to have only one container runtime instance (used as a sharing point between k8s and Docker/BuildKit), but I thought I saw discussion of stuff like that during early dockershim-deprecation discussions, pointing out that you could still install Docker on the nodes, mount the socket into a container in a Pod, and build images, and it'd talk to the containerd so flows like that weren't inherently broken by the dockershim deprecation.
Personally I like to be able to mix-and-match versions in my own processes, but lock down versions in the outputs.
Yeah, I have noticed that you look to have a lot of time for this stuff, which is nice. I don't, but I would still like to see the Windows + containerd combination become usable enough that people can start using it in dev environments and provide feedback.
I dunno if there's many other cases like that around which expect a given host to have only one container runtime instance (used as a sharing point between k8s and Docker/BuildKit)
There is not much to share nowadays, as Docker only uses graphdrivers (at least the Windows implementation; not sure what the status is with migrating the Linux parts) and containerd uses snapshotters, so they cannot share images. See https://blog.mobyproject.org/where-are-containerds-graph-drivers-145fc9b7255
.github/workflows/windows.yml
Outdated
@@ -323,7 +323,7 @@ jobs:
      }
    -
      name: Starting containerd
-     if: matrix.runtime == 'containerd'
+     if: matrix.runtime == 'containerd' && matrix.os == 'windows-2019'
I don't know GitHub Actions that well, but would it be clearer to have the runtime axis with three values (builtin, managed-containerd, system-containerd), and also flag at that point to skip running windows-2019/system-containerd and windows-2022/managed-containerd? That will also make it easier to adjust for "Windows 2025", for example. (Playing the long game. ^_^)
Ah yes, there's an exclude parameter for the matrix definition.
Using the include pattern in the matrix to run specific combinations would also work.
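For illustration only, a three-value runtime axis with exclude entries might look roughly like this; the axis values are hypothetical names from the suggestion above, not the actual workflow:

```yaml
strategy:
  matrix:
    os: [windows-2019, windows-2022]
    runtime: [builtin, managed-containerd, system-containerd]
    exclude:
      # system containerd is only exercised on 2019,
      # managed (child-process) containerd only on 2022
      - os: windows-2019
        runtime: managed-containerd
      - os: windows-2022
        runtime: system-containerd
```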
OK. However, I think there are other open questions, here and in #43892, to be answered first before I continue this further.
Quick blurb from discussing this in the maintainers meeting: there were some discussions recently about whether or not we want to continue having the "containerd as child process" option, or if we should deprecate it, so I wanted to check with other maintainers if we'd be good with this PR.
TL;DR: generally, the features added in this PR are OK (but we didn't review the code) 👍
I opened #43946 to implement that suggestion. Let me try giving this PR a look soon (I'll also do a quick check whether (and where) it conflicts with #43946).
Thanks! I left some thoughts/ramblings 😅 - happy to hear your thoughts.
I also had a quick try locally to rebase this PR on top of #43945, which only gave a small conflict, so not troublesome.
if runtime.GOOS == "windows" {
    dockerdPath, err := os.Executable()
    if err == nil {
        binPath = filepath.Dir(dockerdPath) + `\` + binaryName
Of course only "theoretical" as we're concatenating a const, but perhaps slightly more defensive to use filepath.Join():
- binPath = filepath.Dir(dockerdPath) + `\` + binaryName
+ binPath = filepath.Join(filepath.Dir(dockerdPath), binaryName)
// In Windows containerd was first adopted by other tools so make sure that we
// use only containerd.exe which is provided with Docker binaries.
I'm a bit on the fence on this part. I can see pros (install them together and be sure we're not running some other version), but also "cons": as I wrote in #43892 (comment), we're looking at decoupling the docker and containerd packages more, and perhaps we should consider being flexible as well about "how" someone installed containerd (which could be through some other package).
TL;DR: it's not a "yes" or "no", but perhaps it's good to move this part of the PR to a separate PR and discuss it separately (I think this PR would still work without this change, so moving it out would prevent it from being blocked pending that discussion)
cmd/dockerd/daemon_windows.go
Outdated
func (cli *DaemonCli) initContainerD(ctx context.Context) (func(time.Duration) error, error) {
    if cli.Config.ContainerdAddr == "" && cli.Config.DefaultRuntime == config.WindowsV2RuntimeName {
Heads up that I'm changing this function to initContainerd (lowercase) in #43945. In that PR I also removed the .Config from these, to make it slightly less verbose.
Perhaps this should use an early return here?
if cli.ContainerdAddr != "" {
// use system containerd at the given address.
return nil, nil
}
if cli.Config.DefaultRuntime != config.WindowsV2RuntimeName {
// containerd is only used if the V2 (io.containerd.runhcs.v1) runtime is used.
return nil, nil
}
As to the config.WindowsV2RuntimeName check: I guess we currently use that as a switch to toggle between containerd and the built-in runtime, so that makes sense; slightly wondering if (in light of #43946) we might run into issues if alternative runtimes come to exist (not really urgent, I guess).
cmd/dockerd/daemon_windows.go
Outdated
- func (cli *DaemonCli) initContainerD(_ context.Context) (func(time.Duration) error, error) {
+ func (cli *DaemonCli) initContainerD(ctx context.Context) (func(time.Duration) error, error) {
      if cli.Config.ContainerdAddr == "" && cli.Config.DefaultRuntime == config.WindowsV2RuntimeName {
          cli.Config.ContainerdAddr = supervisor.GrpcPipeName
Curious about the:
cli.Config.ContainerdAddr = supervisor.GrpcPipeName
Wouldn't that be set as part of setDefaults()?
func (r *remote) setDefaults() {
Or was this specifically for the initContainerdRuntime()?
cmd/dockerd/daemon_windows.go
Outdated
func (cli *DaemonCli) initContainerD(ctx context.Context) (func(time.Duration) error, error) {
    if cli.Config.ContainerdAddr == "" && cli.Config.DefaultRuntime == config.WindowsV2RuntimeName {
        cli.Config.ContainerdAddr = supervisor.GrpcPipeName
        system.InitContainerdRuntime(cli.Config.ContainerdAddr)
^^ related to my previous comment; if we set it as part of setDefaults(), then this could be moved to the end of this function, and use r.Address(). With that, we also wouldn't have to export supervisor.GrpcPipeName
cmd/dockerd/daemon_windows.go
Outdated
system.InitContainerdRuntime(cli.Config.ContainerdAddr)

var waitForShutdown func(time.Duration) error
logrus.Debug("Starting daemon managed containerd")
In #43945, I:
- changed this to "managed containerd" (removed "daemon"), and made the logs all lowercase
- changed Debug to Info (for consistency with some other things we start when starting the daemon)
Perhaps something like:
logrus.Info("starting managed containerd")
Or we could even use the same log:
logrus.Info("containerd not running, starting managed containerd")
cmd/dockerd/daemon_windows.go
Outdated
if err != nil {
    return nil, errors.Wrap(err, "failed to start containerd")
}
logrus.Debug("Started daemon managed containerd")
While working on #43945, I noticed that this log was redundant (remote.startContainerd() already logs when it's started, including PID and address), so this line can be removed.
cmd/dockerd/daemon_windows.go
Outdated
waitForShutdown = r.WaitTimeout
return waitForShutdown, nil
waitForShutdown is only used here, so this could be inlined (and the var further up removed):
return r.WaitTimeout, nil
With my suggestions applied, this function would look like;
func (cli *DaemonCli) initContainerd(ctx context.Context) (func(time.Duration) error, error) {
if cli.ContainerdAddr != "" {
// use system containerd at the given address.
system.InitContainerdRuntime(cli.ContainerdAddr)
return nil, nil
}
if cli.DefaultRuntime != config.WindowsV2RuntimeName {
return nil, nil
}
logrus.Info("containerd not running, starting managed containerd")
opts, err := cli.getContainerdDaemonOpts()
if err != nil {
return nil, errors.Wrap(err, "failed to generate containerd options")
}
r, err := supervisor.Start(ctx, filepath.Join(cli.Root, "containerd"), filepath.Join(cli.ExecRoot, "containerd"), opts...)
if err != nil {
return nil, errors.Wrap(err, "failed to start containerd")
}
cli.ContainerdAddr = r.Address()
system.InitContainerdRuntime(r.Address())
// Try to wait for containerd to shutdown
return r.WaitTimeout, nil
}
And with that, the diff between Windows and Linux gets really small (maybe even to a level where we could abstract some of the platform specifics).
grpcPipeName = `\\.\pipe\containerd-containerd`
debugPipeName = `\\.\pipe\containerd-debug`
binaryName = "containerd.exe"
GrpcPipeName = `\\.\pipe\docker-containerd`
See my other comment; possibly we don't need to export this one
daemon/config/config_windows.go
Outdated
// This is used by the `default-runtime` flag in dockerd as the default value.
// On windows we'd prefer to keep this empty so the value is auto-detected based on other options.
StockRuntimeName = ""
Looking at this bit again, and wondering if we should continue "not using" this for Windows, or if we should put com.docker.hcsshim.v1 here (defaulting (for now) to the built-in runtime, with io.containerd.runhcs.v1 as the "opt in" for containerd), where at some point we can change the default.
@olljanat I rebased my local branch and added the suggestions I made above (including splitting that one block into a separate commit); if useful, you can reset your branch to that one: https://github.com/moby/moby/compare/master..thaJeztah:windows_containerd_as_proces_rebase_and_suggestions?expand=1 (also happy to push to your PR here; let me know)
@thaJeztah please do so. I will try to stay away from the keyboard for the next three weeks (after tomorrow) and most probably won't have time to push those changes before then.
Assuming that's for Holiday/Vacation, enjoy! 🌴
oh! let me try to update this PR branch; hold on |
Force-pushed from 1c7add6 to babec53
done 👍
Oh, I forgot that you can update others' PRs.
Yes! I don't always use it, but in this case it's probably good to preserve the discussion 👍 I also pushed #43951 with the bit that I removed from this one (and the part that may need further discussion)
Force-pushed from 8341bb8 to fcbfe99
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

Prefer a containerd binary that's installed next to the dockerd binary. This prevents running a containerd executable that was installed through other means instead of the bundled one, in cases where docker is installed using the static binaries. Also see this Microsoft blog post about this approach: https://devblogs.microsoft.com/oldnewthing/20110620-00/?p=10393 This patch implements a WithDetectLocalBinary() DaemonOpt, which is set by default on Windows (we can consider using the same on other platforms).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Force-pushed from fcbfe99 to d312dae
Yea. Btw, I rebased this now without changes, as there were conflicting changes in
@thaJeztah Should this be included and backported to 23.0, or was there something still unclear?
Discussed in today's maintainers' call: this is something we'd like to see, but we don't want to rush it for 23.0.0. The talking points in short are:
@neersighted dockerd.exe installed as part of Mirantis Container Runtime uses containerd already? Since when? Working code which supports that did not exist in Moby before 23.0.0 (#42164 #41479 #42089 #43866), and there are multiple known issues, which is why a lot of tests are disabled for that combination (#41455). So I'm quite sure the only reason MCR installs containerd is that nowadays Kubernetes uses it directly, without Docker. The reason to rush this one into 23.0.0 is to finally provide a first version where users can easily start piloting the containerd configuration.
Right, we're not actively running containers using the runhcs runtime (that lives on a branch and will likely not be supported commercially for some time) -- what I mean to say is that unexpected things will happen for users (due to two different mechanisms attempting to start containerd on the same socket) based on @thaJeztah's summary of the state of this PR (I have not reviewed it). It's only one of the bullet points as well -- the main takeaway is that the risk seems relatively high, the maintainers who have reviewed this in the past aren't confident in the state today, and we'd rather include this in a patch release as an experimental/power-user feature vs. delay the release of 23.0.0 or rush it in half-baked to a release that has had multiple RCs already.
Bump -- now that 23.0.0 is out, hopefully we'll have some bandwidth to review this. How do you feel about the idea of factoring out the common Linux code into a (partially) shared implementation in this PR?
Signed-off-by: Charity Kathure <ckathure@microsoft.com>
Signed-off-by: Charity Kathure <ckathure@microsoft.com>
Signed-off-by: Charity Kathure <ckathure@microsoft.com>
Signed-off-by: Charity Kathure <ckathure@microsoft.com>
Signed-off-by: Charity Kathure <ckathure@microsoft.com>
Signed-off-by: Charity Kathure <ckathure@microsoft.com>
- What I did
Continued from where #42089 left off, so now users can switch to containerd by just adding
to
daemon.config
without needing to separately install or configure containerd. Requires that containerd binaries are included in the static binaries installation package (#43892).
Also needs to be backported to 22.06.x if this approach is accepted.
- How I did it
Copied the logic from the Linux implementation and updated it. Additionally, used a non-default named pipe so it is possible to run both Docker with containerd and a separate containerd instance on the same machine.
- How to verify it
CI was modified so that it runs Windows 2019 + containerd tests like before, and Windows 2022 + containerd tests using this new approach.
Additionally, manual tests can be done by running
dockerd.exe -D
and checking that you see the
Starting daemon managed containerd
message.
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)