Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for maximum replicas per node #37940

Merged
merged 1 commit into from
Dec 24, 2018

Conversation

olljanat
Copy link
Contributor

@olljanat olljanat commented Sep 30, 2018

- What I did
Added support for maximum replicas per node

Second step to be able solve #26259

- How I did it
Addes support for Placement parameter MaxReplicas

TODO:

@kolyshkin
Copy link
Contributor

build fails:

00:07:31.935 Building: bundles/binary-daemon/dockerd-dev
00:08:09.664 # github.com/docker/docker/daemon/cluster/convert
00:08:09.664 daemon/cluster/convert/service.go:249:4: unknown field 'Maxreplicas' in struct literal of type api.Placement

@olljanat
Copy link
Contributor Author

olljanat commented Oct 2, 2018

Sure it will until swarmkit PR have been merged that why this is marked as WIP now.

Test automation set on Moby is quite complex and I'm not able run all these tests on my dev env so that why I prefer to have PR open here which can run those tests for me.

Next steps are:

  • Do requested changes swarmkit PR
  • Make this PR match with that one.
  • Temporarily add commit which takes swarmkit from my repo

After these build and all these tests (should) pass.

@olljanat olljanat force-pushed the replicas-max-per-node branch 2 times, most recently from 68f5387 to 8b32d49 Compare October 2, 2018 19:00
@codecov
Copy link

codecov bot commented Oct 2, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@d147fe0). Click here to learn what that means.
The diff coverage is 0%.

@@            Coverage Diff            @@
##             master   #37940   +/-   ##
=========================================
  Coverage          ?    36.6%           
=========================================
  Files             ?      608           
  Lines             ?    44990           
  Branches          ?        0           
=========================================
  Hits              ?    16467           
  Misses            ?    26245           
  Partials          ?     2278

@olljanat olljanat force-pushed the replicas-max-per-node branch 2 times, most recently from 22d5ad2 to 4c6a698 Compare October 4, 2018 20:58
@olljanat olljanat changed the title WIP: Added support for maximum replicas per node Added support for maximum replicas per node Oct 17, 2018
@olljanat
Copy link
Contributor Author

olljanat commented Oct 17, 2018

Swarmkit PR was merged so I included swarmkit bump to this.

Builds should pass as they was earlier when I tested to include swarmkit changes from my repo.

@cpuguy83 can you review ?

EDIT: Swarmkit bump looks to be also on #38056 so I will drop it from this when it is merged.

EDIT 2: #38056 was merged. Failing test on experimental looks to be flaky ( #32673 ).

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some minor comments, but I want to give this a spin (haven't had time for that yet).

One thing I'd like to test is the behavior if the --max-replicas-per-node value is updated for an existing service (e.g. change from max 3 replicas to max 2; are the existing tasks properly killed and/or re-scheduled in that case?)

daemon/cluster/convert/service.go Show resolved Hide resolved
api/swagger.yaml Show resolved Hide resolved
@olljanat
Copy link
Contributor Author

One thing I'd like to test is the behavior if the --max-replicas-per-node value is updated for an existing service (e.g. change from max 3 replicas to max 2; are the existing tasks properly killed and/or re-scheduled in that case?)

Currently it looks working on way that if user only updates --max-replicas-per-node (or actually it is --replicas-max-per-node so it shorts nicely on cli help) then only service spec will be updated and extra tasks will be killed on next service update. If user also updates --image value or uses --force switch then extra one will be killed immediately (as all tasks are re-scheduled).

If we need change that logic then it would mean one more change to swarmkit.

Anyway, I will wait that #38003 is merged and implement these API versions checks on meanwhile.

@olljanat
Copy link
Contributor Author

Temporarily included commit from #38003 will cleanup after it is merged.

@olljanat
Copy link
Contributor Author

olljanat commented Nov 7, 2018

@thaJeztah rebased on master and included changes you requested. Can you review again?

@olljanat olljanat force-pushed the replicas-max-per-node branch 2 times, most recently from f24d569 to 298228b Compare November 9, 2018 18:45
@olljanat
Copy link
Contributor Author

@cpuguy83 @thaJeztah PTAL, now this contains only simple API test without logic test as it is on swarmkit side.

Copy link
Member

@cpuguy83 cpuguy83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some minor nits inline (no critical issues, but would be nice to have them in the PR)

I was finally able to give this a spin (see below for some playing with the new option); one thing I noticed (also, see below) that reducing the max-replicas value does not terminate existing tasks. I wondered if this was by design, or an oversight? Not a show-stopper; just curious 🤗

Here's some things I tried:

Create a service with 2 replicas, but maxreplicas 1

curl -s \
  --unix-socket /var/run/docker.sock \
  -X POST \
  "http://localhost/v1.40/services/create" \
  -H "Content-Type: application/json" \
  -d '{"EndpointSpec":{"Mode":"vip"},"Labels":{},"Mode":{"Replicated":{"Replicas":2}},"Name":"test","TaskTemplate":{"ContainerSpec":{"DNSConfig":{},"Image":"nginx:alpine@sha256:2e497c294e3ba84aaeab7a0fbb1027819cd1f5f5892ed3c4a82b8b05010090da"},"ForceUpdate":0,"Placement":{"MaxReplicas":1,"Platforms":[{"Architecture":"amd64","OS":"linux"},{"Architecture":"arm","OS":"linux"},{"Architecture":"arm64","OS":"linux"},{"Architecture":"386","OS":"linux"},{"Architecture":"ppc64le","OS":"linux"},{"Architecture":"s390x","OS":"linux"}]},"Resources":{"Limits":{},"Reservations":{}}}}'

{"ID":"eln7vce863tedwd5q72kjaxkg"}

Verify the MaxReplicas configuration is stored:

curl -s \
  --unix-socket /var/run/docker.sock \
  "http://localhost/v1.40/services/test?insertDefaults=true" | jq .Spec.TaskTemplate.Placement.MaxReplicas

1

And check that no more than 1 replica is running (on a single-node swarm);

docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE               PORTS
qy2qc9p1yder        test                replicated          1/2                 nginx:alpine        

Check the task, and confirm it's unable to start due to the MaxReplicas restriction:

docker service ps test --no-trunc --format 'table {{.Name}}\t{{.Error}}'
NAME                ERROR
test.1              
test.2              "no suitable node (max replicas per node limit exceed)"

Updated MaxReplicas to 2;

Get the service definition

curl -s \
  --unix-socket /var/run/docker.sock \
  "http://localhost/v1.40/services/test"
{"ID":"cqnbsmr9foblwucehzu9ekrpb","Version":{"Index":87},"CreatedAt":"2018-12-23T22:23:22.415002133Z","UpdatedAt":"2018-12-23T22:23:22.415002133Z","Spec":{"Name":"test","Labels":{},"TaskTemplate":{"ContainerSpec":{"Image":"nginx:alpine@sha256:2e497c294e3ba84aaeab7a0fbb1027819cd1f5f5892ed3c4a82b8b05010090da","DNSConfig":{},"Isolation":"default"},"Resources":{"Limits":{},"Reservations":{}},"Placement":{"MaxReplicas":1,"Platforms":[{"Architecture":"amd64","OS":"linux"},{"Architecture":"arm","OS":"linux"},{"Architecture":"arm64","OS":"linux"},{"Architecture":"386","OS":"linux"},{"Architecture":"ppc64le","OS":"linux"},{"Architecture":"s390x","OS":"linux"}]},"ForceUpdate":0,"Runtime":"container"},"Mode":{"Replicated":{"Replicas":2}},"EndpointSpec":{"Mode":"vip"}},"Endpoint":{"Spec":{}}}

Update the service to set MaxReplicas to 2

curl -s \
  --unix-socket /var/run/docker.sock \
  -X POST \
  "http://localhost/v1.40/services/test/update?version=87" \
  -H "Content-Type: application/json" \
  -d '{"Name":"test","Labels":{},"TaskTemplate":{"ContainerSpec":{"Image":"nginx:alpine@sha256:2e497c294e3ba84aaeab7a0fbb1027819cd1f5f5892ed3c4a82b8b05010090da","DNSConfig":{},"Isolation":"default"},"Resources":{"Limits":{},"Reservations":{}},"Placement":{"MaxReplicas":2,"Platforms":[{"Architecture":"amd64","OS":"linux"},{"Architecture":"arm","OS":"linux"},{"Architecture":"arm64","OS":"linux"},{"Architecture":"386","OS":"linux"},{"Architecture":"ppc64le","OS":"linux"},{"Architecture":"s390x","OS":"linux"}]},"ForceUpdate":0,"Runtime":"container"},"Mode":{"Replicated":{"Replicas":2}},"EndpointSpec":{"Mode":"vip"}}'

Verify the service was updated with the new value (2);

curl -s \
  --unix-socket /var/run/docker.sock \
  "http://localhost/v1.40/services/test" | jq .Spec.TaskTemplate.Placement.MaxReplicas

2

And confirm that the service was now able to deploy 2 replicas:

docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE               PORTS
cqnbsmr9fobl        test                replicated          2/2                 nginx:alpine        

Now update the service again, and set MaxReplicas back to 1

curl -s \
  --unix-socket /var/run/docker.sock \
  -X POST \
  "http://localhost/v1.40/services/test/update?version=172" \
  -H "Content-Type: application/json" \
  -d '{"Name":"test","Labels":{},"TaskTemplate":{"ContainerSpec":{"Image":"nginx:alpine@sha256:2e497c294e3ba84aaeab7a0fbb1027819cd1f5f5892ed3c4a82b8b05010090da","DNSConfig":{},"Isolation":"default"},"Resources":{"Limits":{},"Reservations":{}},"Placement":{"MaxReplicas":1,"Platforms":[{"Architecture":"amd64","OS":"linux"},{"Architecture":"arm","OS":"linux"},{"Architecture":"arm64","OS":"linux"},{"Architecture":"386","OS":"linux"},{"Architecture":"ppc64le","OS":"linux"},{"Architecture":"s390x","OS":"linux"}]},"ForceUpdate":0,"Runtime":"container"},"Mode":{"Replicated":{"Replicas":2}},"EndpointSpec":{"Mode":"vip"}}'

Verify the service was updated with the new value (1)

curl -s \
  --unix-socket /var/run/docker.sock \
  "http://localhost/v1.40/services/test" | jq .Spec.TaskTemplate.Placement.MaxReplicas

1

Note that when reducing the maximum number of replicas, existing tasks are not terminated;

docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE               PORTS
cqnbsmr9fobl        test                replicated          2/2                 nginx:alpine        

Another thing that needs to be taken into account is that scaling/updating a service with an older version of the CLI will strip the MaxReplicas. That's the expected behavior, but perhaps something we should mention somewhere in the documentation :thinking_face:

docker service scale test=3
test scaled to 3

After scaling, the MaxReplicas property is reset (because the CLI does not know the new property);

curl -s \
  --unix-socket /var/run/docker.sock \
  "http://localhost/v1.40/services/test?insertDefaults=true" | jq .Spec.TaskTemplate.Placement.MaxReplicas

null

api/server/router/swarm/cluster_routes.go Show resolved Hide resolved
docs/api/version-history.md Outdated Show resolved Hide resolved
api/swagger.yaml Outdated Show resolved Hide resolved
api/swagger.yaml Show resolved Hide resolved
@olljanat
Copy link
Contributor Author

olljanat commented Dec 23, 2018

one thing I noticed (also, see below) that reducing the max-replicas value does not terminate existing tasks. I wondered if this was by design, or an oversight? Not a show-stopper; just curious 🤗

By design as we don't want kill running containers when it is not absolutely necessary and user have option to use --force (not sure what it is on API side) and then all tasks will be re-scheduled and situation fixed.

Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
@olljanat
Copy link
Contributor Author

@thaJeztah updated based on review. Now hopefully ready to be merged?

@thaJeztah
Copy link
Member

By design as we don't want kill running containers when it is not absolutely necessary and user have option to use --force (not sure what it is on API side) and then all tasks will be re-scheduled and situation fixed.

Gotcha; makes sense. We'll have to document that, and explain the behavior (i.e., if they want to force a reschedule, use the --force)

Trying that scenario:

Create a service with MaxReplicas: 2, and Replicas: 2

curl -s \
  --unix-socket /var/run/docker.sock \
  -X POST \
  "http://localhost/v1.40/services/create" \
  -H "Content-Type: application/json" \
  -d '{"EndpointSpec":{"Mode":"vip"},"Labels":{},"Mode":{"Replicated":{"Replicas":2}},"Name":"test","TaskTemplate":{"ContainerSpec":{"DNSConfig":{},"Image":"nginx:alpine@sha256:2e497c294e3ba84aaeab7a0fbb1027819cd1f5f5892ed3c4a82b8b05010090da"},"ForceUpdate":0,"Placement":{"MaxReplicas":2,"Platforms":[{"Architecture":"amd64","OS":"linux"},{"Architecture":"arm","OS":"linux"},{"Architecture":"arm64","OS":"linux"},{"Architecture":"386","OS":"linux"},{"Architecture":"ppc64le","OS":"linux"},{"Architecture":"s390x","OS":"linux"}]},"Resources":{"Limits":{},"Reservations":{}}}}'

{"ID":"kdu0l3uxhk6p52xa719581h0h"}

Verify the MaxReplicas configuration is stored:

curl -s \
  --unix-socket /var/run/docker.sock \
  "http://localhost/v1.40/services/test" | jq .Spec.TaskTemplate.Placement

2

Confirm the service has two tasks running:

docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE               PORTS
rmg5iwvxl158        test                replicated          2/2                 nginx:alpine        

Update MaxReplicas to 2, and set ForceUpdate to a new value (to force the update);

Get the service definition

curl -s \
  --unix-socket /var/run/docker.sock \
  "http://localhost/v1.40/services/test"
{"ID":"kdu0l3uxhk6p52xa719581h0h","Version":{"Index":27},"CreatedAt":"2018-12-24T09:37:12.633108733Z","UpdatedAt":"2018-12-24T09:37:12.633108733Z","Spec":{"Name":"test","Labels":{},"TaskTemplate":{"ContainerSpec":{"Image":"nginx:alpine@sha256:2e497c294e3ba84aaeab7a0fbb1027819cd1f5f5892ed3c4a82b8b05010090da","DNSConfig":{},"Isolation":"default"},"Resources":{"Limits":{},"Reservations":{}},"Placement":{"MaxReplicas":2,"Platforms":[{"Architecture":"amd64","OS":"linux"},{"Architecture":"arm","OS":"linux"},{"Architecture":"arm64","OS":"linux"},{"Architecture":"386","OS":"linux"},{"Architecture":"ppc64le","OS":"linux"},{"Architecture":"s390x","OS":"linux"}]},"ForceUpdate":0,"Runtime":"container"},"Mode":{"Replicated":{"Replicas":2}},"EndpointSpec":{"Mode":"vip"}},"Endpoint":{"Spec":{}}}

Update the service, set MaxReplicas to 1, and set ForceUpdate to a new value (to force the update);

curl -s \
  --unix-socket /var/run/docker.sock \
  -X POST \
  "http://localhost/v1.40/services/test/update?version=27" \
  -H "Content-Type: application/json" \
  -d '{"Name":"test","Labels":{},"TaskTemplate":{"ContainerSpec":{"Image":"nginx:alpine@sha256:2e497c294e3ba84aaeab7a0fbb1027819cd1f5f5892ed3c4a82b8b05010090da","DNSConfig":{},"Isolation":"default"},"Resources":{"Limits":{},"Reservations":{}},"Placement":{"MaxReplicas":1,"Platforms":[{"Architecture":"amd64","OS":"linux"},{"Architecture":"arm","OS":"linux"},{"Architecture":"arm64","OS":"linux"},{"Architecture":"386","OS":"linux"},{"Architecture":"ppc64le","OS":"linux"},{"Architecture":"s390x","OS":"linux"}]},"ForceUpdate":1,"Runtime":"container"},"Mode":{"Replicated":{"Replicas":2}},"EndpointSpec":{"Mode":"vip"}}'

{"Warnings":null}

Confirm that the service now has 1 task running:

docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE               PORTS
kdu0l3uxhk6p        test                replicated          1/2                 nginx:alpine      

And that this is due to the new MaxReplicas limit:

docker service ps test --no-trunc --format 'table {{.Name}}\t{{.Error}}'
NAME                ERROR
test.1              
test.2              "no suitable node (max replicas per node limit exceed)"
 \_ test.2          

Looking good 👍

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!!!

@thaJeztah
Copy link
Member

@olljanat was the failure only a flaky test? If so, I'm happy to ignore PowerPC and merge

@olljanat
Copy link
Contributor Author

@thaJeztah yes. Those which I disabled on #38423 looks causing issues.

@thaJeztah
Copy link
Member

Alright let's get this one merged 👍

@thaJeztah thaJeztah merged commit 8fbf259 into moby:master Dec 24, 2018
@olljanat olljanat deleted the replicas-max-per-node branch December 25, 2018 04:00
@vpotseluyko
Copy link

Hi! I greatly love this commit. With what docker & swarm-kit version could it be actually used at production

@thaJeztah
Copy link
Member

thaJeztah commented Feb 13, 2019

It's not released yet; will be in the next docker release (19.03) (but you should be able to give it a spin with the nightly builds https://download.docker.com/linux/ubuntu/dists/bionic/pool/nightly/amd64/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants