Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TM DownloadFile Pause and Resume] Part 1: Add configuration to enable overwriting existing files #3125

Merged
merged 4 commits into from Mar 29, 2022

Conversation

zoewangg
Copy link
Contributor

@zoewangg zoewangg commented Mar 24, 2022

Motivation and Context

#3108
#2300

Modifications

  • Expose an option to overwrite an existing file in FileAsyncResponseTransformer.

  • Note that it's not added to FileResponseTransformer right now because it's not trivial to do so. I will create a backlog item for it.

Testing

Added unit tests and integ tests

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

@zoewangg zoewangg requested a review from a team as a code owner March 24, 2022 23:17
.changes/next-release/feature-AWSSDKforJavav2-1ee4d61.json Outdated Show resolved Hide resolved
Comment on lines 70 to 78
private long determineFilePositionToWrite(Path path) {
if (fileModificationOption == CREATE_OR_APPEND_EXISTING) {
try {
return Files.size(path);
} catch (NoSuchFileException e) {
// Ignore
} catch (IOException exception) {
throw SdkClientException.create("Cannot determine the current file size " + path, exception);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an expensive thing to do all the time (e.g. if the file is always being created for the first time). Is there any way we can defer this until we need it or use a solution that doesn't rely on catching NoSuchFileException?

Copy link
Contributor Author

@zoewangg zoewangg Mar 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File position has to be determined in the ctor because otherwise retry would not work for appending in the case where certain bytes have been written. We could callFiles.notExists, but it also relies on catching NoSuchFileException behind the scene.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm couldn't we do it at the time we open the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No not really, we open the file in onStream method, so the position we get at this point may not be correct if the previous attempt had written a few bytes already before it failed.

@zoewangg zoewangg changed the base branch from master to zoewang/tm-download-pause-resume March 25, 2022 19:01
@zoewangg
Copy link
Contributor Author

zoewangg commented Mar 25, 2022

Changed the target branch to zoewang/tm-download-pause-resume to make sure it works with TM download pause/resume before we merge it to mainline.

I'm still updating the PR

@zoewangg zoewangg changed the title Add configuration to enable overwriting existing files [TM DownloadFile Pause and Resume] Part 1: Add configuration to enable overwriting existing files Mar 29, 2022
@sonarcloud
Copy link

sonarcloud bot commented Mar 29, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 11 Code Smells

88.4% 88.4% Coverage
0.0% 0.0% Duplication

*/
public static FileTransformerConfiguration defaultCreateOrAppend() {
return builder().fileWriteOption(FileWriteOption.CREATE_OR_APPEND_EXISTING)
.failureBehavior(FailureBehavior.LEAVE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason for this default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, retry would not work as expected if we delete the file when there is an exception.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it also might not work as expected if we started appending, failed and then on a retry reappended the same data.

Copy link
Contributor Author

@zoewangg zoewangg Mar 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial position is recorded in the ctor, so we will always append on the same offset on retries (overwrite whatever bytes that have been written in the previous attempt)

Comment on lines +68 to +77
if (configuration.fileWriteOption() == CREATE_OR_APPEND_EXISTING) {
try {
return Files.size(path);
} catch (NoSuchFileException e) {
// Ignore
} catch (IOException exception) {
throw SdkClientException.create("Cannot determine the current file size " + path, exception);
}
}
return 0L;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a backlog item to revisit this? It makes me nervous to rely on catching an exception on a possibly hot code path.

Copy link
Contributor Author

@zoewangg zoewangg Mar 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could call Files#exists first but it uses the same try-catch though... I'm not sure how we can optimize it, but yeah, I can add a TODO

@zoewangg zoewangg merged commit eade04e into zoewang/tm-download-pause-resume Mar 29, 2022
@zoewangg zoewangg deleted the zoewang/fileOverwrite branch March 29, 2022 23:49
zoewangg added a commit that referenced this pull request Apr 27, 2022
* [TM DownloadFile Pause and Resume] Part 1: Add configuration to enable overwriting existing files (#3125)

* Expose an option to overwrite an existing file in FileAsyncResponseTransformer

* Add changelog entries and make TM use CREATE_OR_REPLACE_EXISTING write option by default

* Address feedback

* Update and address feedback

* [TM DownloadFile Pause and Resume] Part 2: Implement pause for downloadFile operation (#3094)

* Part 1: Implement pause for downloadFile operation

* Address feedback

* Refactor the logic

* Address feedback

* Fix merging issue

* [TM DownloadFile Pause and Resume] Part 3: Implement resumeDownloadFile (#3157)

* Implement resumeDownloadFile

* Move test code around

* Address feedback

* Fix flaky test

* fix flaky integ test

* add changelog entry

* Troubleshooting flaky test

* Add file length check when checking if file has modified or not

* Address feedback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants