Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve s3 backup store client reliability #10603

Merged
merged 3 commits into from Oct 6, 2022

Conversation

oleschoenburg
Copy link
Member

This adds the ability to set an overall api call timeout which appears to be the only thing that fixes #10547. Additionally, some other configuration for the client is updated to smooth out load spikes when taking backups and to automatically adjust various other parameters based on the detected environment.

@github-actions
Copy link
Contributor

github-actions bot commented Oct 4, 2022

Test Results

   934 files  ±    0     934 suites  ±0   2h 4m 32s ⏱️ -7s
7 636 tests +254  7 630 ✔️ +254  6 💤 ±0  0 ±0 
7 826 runs  +254  7 818 ✔️ +254  8 💤 ±0  0 ±0 

Results for commit 5cff223. ± Comparison against base commit f6ca10e.

♻️ This comment has been updated with latest results.

@oleschoenburg oleschoenburg force-pushed the 10547-s3-backup-store-reliability branch from 45789b1 to 0284d98 Compare October 4, 2022 15:20
@@ -19,6 +20,8 @@
* try to discover an appropriate value from the environment.
* @param credentials If no value is provided, the AWS SDK will try to discover appropriate values
* from the environment.
* @param apiCallTimeout Used as the overall api call timeout for the AWS SDK. API calls that exceed

Check notice

Code scanning / CodeQL

Spurious Javadoc @param tags

@param tag "apiCallTimeout" does not match any actual type parameter of type "S3BackupConfig".
@oleschoenburg oleschoenburg force-pushed the 10547-s3-backup-store-reliability branch from 0284d98 to f7bc1cf Compare October 5, 2022 06:55
@oleschoenburg oleschoenburg marked this pull request as ready for review October 5, 2022 06:56
Copy link
Contributor

@deepthidevaki deepthidevaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Looks good to me.

❌ Please document the new parameter in config templates in dist/config

@@ -17,6 +18,7 @@ public class S3BackupStoreConfig implements ConfigurationEntry {
private String region;
private String accessKey;
private String secretKey;
private Duration apiCallTimeout;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 I suggest to provide a default value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've set it to 180 seconds. If we assume that the largest file of a backup is a 128MiB segment this requires less than 1MiB/s throughput which should be reasonable.

Copy link
Contributor

@deepthidevaki deepthidevaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@oleschoenburg
Copy link
Member Author

bors merge

zeebe-bors-camunda bot added a commit that referenced this pull request Oct 6, 2022
10593: deps(maven): bump proto-google-common-protos from 2.9.3 to 2.9.6 r=oleschoenburg a=dependabot[bot]

Bumps [proto-google-common-protos](https://github.com/googleapis/java-iam) from 2.9.3 to 2.9.6.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a href="https://github.com/googleapis/java-iam/commits">compare view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=com.google.api.grpc:proto-google-common-protos&package-manager=maven&previous-version=2.9.3&new-version=2.9.6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting ``@dependabot` rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- ``@dependabot` rebase` will rebase this PR
- ``@dependabot` recreate` will recreate this PR, overwriting any edits that have been made to it
- ``@dependabot` merge` will merge this PR after your CI passes on it
- ``@dependabot` squash and merge` will squash and merge this PR after your CI passes on it
- ``@dependabot` cancel merge` will cancel a previously requested merge and block automerging
- ``@dependabot` reopen` will reopen this PR if it is closed
- ``@dependabot` close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- ``@dependabot` ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- ``@dependabot` ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)


</details>

10603: Improve s3 backup store client reliability r=oleschoenburg a=oleschoenburg

This adds the ability to set an overall api call timeout which appears to be the only thing that fixes #10547. Additionally, some other configuration for the client is updated to smooth out load spikes when taking backups and to automatically adjust various other parameters based on the detected environment.

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
@zeebe-bors-camunda
Copy link
Contributor

Build failed (retrying...):

zeebe-bors-camunda bot added a commit that referenced this pull request Oct 6, 2022
10603: Improve s3 backup store client reliability r=oleschoenburg a=oleschoenburg

This adds the ability to set an overall api call timeout which appears to be the only thing that fixes #10547. Additionally, some other configuration for the client is updated to smooth out load spikes when taking backups and to automatically adjust various other parameters based on the detected environment.

Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
@zeebe-bors-camunda
Copy link
Contributor

Build failed:

@oleschoenburg
Copy link
Member Author

bors retry

zeebe-bors-camunda bot added a commit that referenced this pull request Oct 6, 2022
10603: Improve s3 backup store client reliability r=oleschoenburg a=oleschoenburg

This adds the ability to set an overall api call timeout which appears to be the only thing that fixes #10547. Additionally, some other configuration for the client is updated to smooth out load spikes when taking backups and to automatically adjust various other parameters based on the detected environment.

Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
@zeebe-bors-camunda
Copy link
Contributor

Build failed:

@oleschoenburg
Copy link
Member Author

bors retry

zeebe-bors-camunda bot added a commit that referenced this pull request Oct 6, 2022
10603: Improve s3 backup store client reliability r=oleschoenburg a=oleschoenburg

This adds the ability to set an overall api call timeout which appears to be the only thing that fixes #10547. Additionally, some other configuration for the client is updated to smooth out load spikes when taking backups and to automatically adjust various other parameters based on the detected environment.

Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
@zeebe-bors-camunda
Copy link
Contributor

Build failed:

@oleschoenburg
Copy link
Member Author

bors retry

zeebe-bors-camunda bot added a commit that referenced this pull request Oct 6, 2022
10603: Improve s3 backup store client reliability r=oleschoenburg a=oleschoenburg

This adds the ability to set an overall api call timeout which appears to be the only thing that fixes #10547. Additionally, some other configuration for the client is updated to smooth out load spikes when taking backups and to automatically adjust various other parameters based on the detected environment.

Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
@zeebe-bors-camunda
Copy link
Contributor

Build failed:

@oleschoenburg
Copy link
Member Author

bors merge

@zeebe-bors-camunda
Copy link
Contributor

Build succeeded:

@zeebe-bors-camunda zeebe-bors-camunda bot merged commit d8e2eaf into main Oct 6, 2022
@backport-action
Copy link
Collaborator

Successfully created backport PR #10619 for stable/8.1.

zeebe-bors-camunda bot added a commit that referenced this pull request Oct 6, 2022
10619: [Backport stable/8.1] Improve s3 backup store client reliability r=oleschoenburg a=backport-action

# Description
Backport of #10603 to `stable/8.1`.

relates to 

Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
zeebe-bors-camunda bot added a commit that referenced this pull request Oct 7, 2022
10619: [Backport stable/8.1] Improve s3 backup store client reliability r=oleschoenburg a=backport-action

# Description
Backport of #10603 to `stable/8.1`.

relates to 

Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
@korthout korthout added the version:8.1.1 Marks an issue as being completely or in parts released in 8.1.1 label Oct 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
version:8.1.1 Marks an issue as being completely or in parts released in 8.1.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IN_PROGRESS backup is stuck
4 participants