Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cluster: set bigger grpc limit for array requests #38103

Merged
merged 1 commit into from Oct 31, 2018

Conversation

tonistiigi
Copy link
Member

Add bigger limits in the swarmkit calls that can return a lot of data. This is a regression in the grpc library update where the previous version didn't have a limit in this case while new has a 4MB limit. https://github.com/grpc/grpc-go/pull/1165/files#diff-e1550a73f5d25064c8b586ec68d81a64R105

This can be followed up by adding better defaults in swarmkit on the dialer level.

@anshulpundir @andrewhsu

Signed-off-by: Tonis Tiigi tonistiigi@gmail.com

Copy link
Member

@andrewhsu andrewhsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

looks like this will also address #37997 (to some extent)

@cpuguy83
Copy link
Member

Can someone please comment on these issues with the analysis?

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually; @tonistiigi could you update the commit description to include some information about this change, and explain why we have to set this limit now?

i.e., that it was previously not limited by grpc-go, but as of grpc/grpc-go@50d4175 / grpc/grpc-go@6f8b553 (grpc/grpc-go#1165) is.

Looking at git blame, it looks like this change was brought in as part of the containerd 1.1 bump in commit 52ed3e0 (#36895), which is part of Docker 18.06 and up.

@codecov
Copy link

codecov bot commented Oct 30, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@b8e87cf). Click here to learn what that means.
The diff coverage is 0%.

@@            Coverage Diff            @@
##             master   #38103   +/-   ##
=========================================
  Coverage          ?    36.1%           
=========================================
  Files             ?      610           
  Lines             ?    45216           
  Branches          ?        0           
=========================================
  Hits              ?    16327           
  Misses            ?    26650           
  Partials          ?     2239

@tonistiigi
Copy link
Member Author

@thaJeztah Updated commit description with the same message as in the PR description

Copy link
Member

@cpuguy83 cpuguy83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you

@cpuguy83
Copy link
Member

Opened to address issues with large list operations moby/swarmkit#2774

@thaJeztah
Copy link
Member

Alternative approach; moby/swarmkit#2773

@tonistiigi
Copy link
Member Author

tonistiigi commented Oct 30, 2018

@andrewhsu I'm not really sure what the highest limit is that users could reach under heavy load so maybe I'll just restore it to maxuint32 (instead of 32MB) like it was before(before 18.06), effectively disabling all limits. Then in the future releases, we can figure out what API changes are needed(moby/swarmkit#2774 (comment)) to get these response sizes down to something normal(and possibly stream them as well).

@andrewhsu
Copy link
Member

With about 70 services with average of 20 tasks each, we're getting errors that the message size of 5MB is above the current 4MB limit, so increasing the message size limit to 32MB is a good start to buy us some breathing room until we get some more analysis on the grpc message size, tracked here.

@cpuguy83
Copy link
Member

The problem is the limit breaks existing setups and we already know that people will be broken at 32MB (or at least that is what is claimed).

I'd vote for setting back to the old value for this reason.

@cpuguy83
Copy link
Member

This should be back ported to 18.06, IHMO, to unblock people without forcing a major update.

@andrewhsu
Copy link
Member

andrewhsu commented Oct 30, 2018

I talked with @anshulpundir and @tonistiigi about this and came to consensus that bringing the max message size to the old original value of max of int32 is the best solution for this PR's desired fix. (i lost the arm-wrestling match)

I'll update backports after this PR is adjusted.

4MB client side limit was introduced in vendoring go-grpc#1165 (v1.4.0)
making these requests likely to produce errors

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
@tonistiigi
Copy link
Member Author

Updated

Copy link
Member

@cpuguy83 cpuguy83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM again.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still LGTM

Copy link
Contributor

@anshulpundir anshulpundir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM again

@thaJeztah
Copy link
Member

PowerPC failure is a flaky test; #34988

00:39:03 FAIL: docker_api_swarm_test.go:359: DockerSwarmSuite.TestAPISwarmRaftQuorum
00:39:03 
00:39:03 [dda1effb20dae] waiting for daemon to start
00:39:03 [dda1effb20dae] daemon started
00:39:03 
00:39:03 [dd36cc83874c7] waiting for daemon to start
00:39:03 [dd36cc83874c7] daemon started
00:39:03 
00:39:03 [d7fc52a3c7429] waiting for daemon to start
00:39:03 [d7fc52a3c7429] daemon started
00:39:03 
00:39:03 [dd36cc83874c7] exiting daemon
00:39:03 [d7fc52a3c7429] exiting daemon
00:39:03 docker_api_swarm_test.go:385:
00:39:03     // d1 will eventually step down from leader because there is no longer an active quorum, wait for that to happen
00:39:03     waitAndAssert(c, defaultReconciliationTimeout, func(c *check.C) (interface{}, check.CommentInterface) {
00:39:03         _, err = cli.ServiceCreate(context.Background(), service.Spec, types.ServiceCreateOptions{})
00:39:03         return err.Error(), nil
00:39:03     }, checker.Contains, "Make sure more than half of the managers are online.")
00:39:03 docker_utils_test.go:435:
00:39:03     c.Assert(v, checker, args...)
00:39:03 ... obtained string = "Error response from daemon: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
00:39:03 ... substring string = "Make sure more than half of the managers are online."
00:39:03 
00:39:03 [dda1effb20dae] exiting daemon

WindowsRS5 CI is broken currently, and needs to be looked into. Opened an issue for that; #38114

23:11:20 INFO: Base image for tests is microsoft/windowsservercore
23:11:20 INFO: Loading windowsservercore .tar from disk into the daemon under test. This may take some time...
23:12:08 
23:12:08 
23:12:08 ERROR: Failed 'ERROR: Failed to load c:\baseimages\windowsservercore.tar into daemon under test' at 10/30/2018 23:12:08
23:12:08 At C:\gopath\src\github.com\docker\docker\hack\ci\windows.ps1:705 char:21
23:12:08 + ...             Throw $("ERROR: Failed to load $readBaseFrom`:\baseimages ...
23:12:08 +                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@thaJeztah thaJeztah merged commit 547f11d into moby:master Oct 31, 2018
thaJeztah added a commit to thaJeztah/docker that referenced this pull request Nov 1, 2018
Changes included;

- moby/swarmkit#2735 Assign secrets individually to each task
- moby/swarmkit#2759 Adding a new `Deallocator` component
- moby/swarmkit#2738 Add additional info for secret drivers
- moby/swarmkit#2775 Increase grpc max recv message size
  - addresses moby#37941
  - addresses moby#37997
  - follow-up to moby#38103

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
docker-jenkins pushed a commit to docker/docker-ce that referenced this pull request Nov 12, 2018
Changes included;

- moby/swarmkit#2735 Assign secrets individually to each task
- moby/swarmkit#2759 Adding a new `Deallocator` component
- moby/swarmkit#2738 Add additional info for secret drivers
- moby/swarmkit#2775 Increase grpc max recv message size
  - addresses moby/moby#37941
  - addresses moby/moby#37997
  - follow-up to moby/moby#38103

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: be3843c8c8fb30b4a604dae9d0dad3d393db717c
Component: engine
adhulipa pushed a commit to adhulipa/docker that referenced this pull request Apr 11, 2019
Changes included;

- moby/swarmkit#2735 Assign secrets individually to each task
- moby/swarmkit#2759 Adding a new `Deallocator` component
- moby/swarmkit#2738 Add additional info for secret drivers
- moby/swarmkit#2775 Increase grpc max recv message size
  - addresses moby#37941
  - addresses moby#37997
  - follow-up to moby#38103

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
thaJeztah added a commit to thaJeztah/docker that referenced this pull request Sep 12, 2019
…3 branch)

full diff: moby/swarmkit@4fb9e96...bbe3418

changes included:

- moby/swarmkit#2889 [19.03 backport] Fix update out of sequence and increase max recv gRPC message size for nodes and secrets

Which relates to

- moby#39531 integration-cli: fix swarm tests flakiness
- docker#345 [19.03 backport] integration-cli: fix swarm tests flakiness

And includes backports of

- moby/swarmkit#2808 Fix flaky tests
- moby/swarmkit#2866 Swap gometalinter for golangci-lint
- moby/swarmkit#2869 Increase max recv gRPC message size to initialize connection broker
 - related / similar to moby#38103 / docker#102 cluster: set bigger grpc limit for array requests
 - related / similar to moby#39306 Increase max recv gRPC message size for nodes and secrets
 - fixes moby/swarmkit#2733 Error generated when messages size is too big
- moby/swarmkit#2870 Fix update out of sequence

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
docker-jenkins pushed a commit to docker/docker-ce that referenced this pull request Sep 12, 2019
…3 branch)

full diff: moby/swarmkit@4fb9e96...bbe3418

changes included:

- moby/swarmkit#2889 [19.03 backport] Fix update out of sequence and increase max recv gRPC message size for nodes and secrets

Which relates to

- moby/moby#39531 integration-cli: fix swarm tests flakiness
- docker/engine#345 [19.03 backport] integration-cli: fix swarm tests flakiness

And includes backports of

- moby/swarmkit#2808 Fix flaky tests
- moby/swarmkit#2866 Swap gometalinter for golangci-lint
- moby/swarmkit#2869 Increase max recv gRPC message size to initialize connection broker
 - related / similar to moby/moby#38103 / docker/engine#102 cluster: set bigger grpc limit for array requests
 - related / similar to moby/moby#39306 Increase max recv gRPC message size for nodes and secrets
 - fixes moby/swarmkit#2733 Error generated when messages size is too big
- moby/swarmkit#2870 Fix update out of sequence

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: f7dbee3eeaa1dd218116f85b8f60361acbd5b214
Component: engine
thaJeztah added a commit to thaJeztah/docker that referenced this pull request Oct 9, 2019
…v18.09)

full diff: moby/swarmkit@142a737...5c86095

- moby/swarmkit#2892 [18.09 backport] Remove hardcoded IPAM config subnet value for ingress network
    - backport of moby/swarmkit#2890 Remove hardcoded IPAM config subnet value for ingress network
    - fixes [ENGORC-2651](https://docker.atlassian.net/browse/ENGORC-2651)
- moby/swarmkit#2836 [18.09 backport] Switch to go 1.11
    - backport of moby/swarmkit#2752 Switch to go 1.11
- moby/swarmkit#2901 [18.09 backport] Bump to golang 1.12.9
    - backport of moby/swarmkit#2880 Bump to golang 1.12.9
- moby/swarmkit#2900 [18.09 backport] Fix update out of sequence and increase max recv gRPC message size for nodes and secrets
    - backport of moby/swarmkit#2762 Increased wait time on test utils WaitForCluster and WatchTaskCreate
    - backport of moby/swarmkit#2771 Allow using Configs as CredentialSpecs
        - **second commit only** (attempt to fix weirdly broken tests)
    - backport of moby/swarmkit#2808 Fix flaky tests
    - backport of moby/swarmkit#2866 Swap gometalinter for golangci-lint
    - backport of moby/swarmkit#2869 Increase max recv gRPC message size to initialize connection broker
        - related / similar to moby#38103 / docker#102 cluster: set bigger grpc limit for array requests
        - related / similar to moby#39306 Increase max recv gRPC message size for nodes and secrets
        - fixes moby/swarmkit#2733 Error generated when messages size is too big
    - backport of moby/swarmkit#2870 Fix update out of sequence

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
docker-jenkins pushed a commit to docker/docker-ce that referenced this pull request Oct 23, 2019
…v18.09)

full diff: moby/swarmkit@142a737...5c86095

- moby/swarmkit#2892 [18.09 backport] Remove hardcoded IPAM config subnet value for ingress network
    - backport of moby/swarmkit#2890 Remove hardcoded IPAM config subnet value for ingress network
    - fixes [ENGORC-2651](https://docker.atlassian.net/browse/ENGORC-2651)
- moby/swarmkit#2836 [18.09 backport] Switch to go 1.11
    - backport of moby/swarmkit#2752 Switch to go 1.11
- moby/swarmkit#2901 [18.09 backport] Bump to golang 1.12.9
    - backport of moby/swarmkit#2880 Bump to golang 1.12.9
- moby/swarmkit#2900 [18.09 backport] Fix update out of sequence and increase max recv gRPC message size for nodes and secrets
    - backport of moby/swarmkit#2762 Increased wait time on test utils WaitForCluster and WatchTaskCreate
    - backport of moby/swarmkit#2771 Allow using Configs as CredentialSpecs
        - **second commit only** (attempt to fix weirdly broken tests)
    - backport of moby/swarmkit#2808 Fix flaky tests
    - backport of moby/swarmkit#2866 Swap gometalinter for golangci-lint
    - backport of moby/swarmkit#2869 Increase max recv gRPC message size to initialize connection broker
        - related / similar to moby/moby#38103 / docker/engine#102 cluster: set bigger grpc limit for array requests
        - related / similar to moby/moby#39306 Increase max recv gRPC message size for nodes and secrets
        - fixes moby/swarmkit#2733 Error generated when messages size is too big
    - backport of moby/swarmkit#2870 Fix update out of sequence

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: e06f07ef337ab890f211397d6b408b75a2512dc5
Component: engine
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants