Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S3 upload_file, download file to support path-lib objects #2259

Merged
merged 31 commits into from Dec 16, 2022
Merged

S3 upload_file, download file to support path-lib objects #2259

merged 31 commits into from Dec 16, 2022

Conversation

alanyee
Copy link
Contributor

@alanyee alanyee commented Jan 8, 2020

S3 upload_file, download file to support pathlib.Path

Add unit tests

#2206

S3 upload_file, download file to support path-lib objects
@alanyee alanyee changed the title Update transfer.py S3 upload_file, download file to support path-lib objects Jan 8, 2020
fspath requires Python 3.6 or higher
@github-actions
Copy link

github-actions bot commented Jan 8, 2021

Greetings! It looks like this issue hasn’t been active in longer than one year. We encourage you to check if this is still an issue in the latest release. Because it has been longer than one year since the last update on this, and in the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment to prevent automatic closure, or if the issue is already closed, please feel free to reopen it.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Jan 8, 2021
@alanyee
Copy link
Contributor Author

alanyee commented Jan 8, 2021

The issue still has not been resolved, and #2206 should be reopened. This PR should remain open.

@github-actions github-actions bot removed the closing-soon This issue will automatically close in 4 days unless further comments are made. label Jan 8, 2021
@swetashre swetashre added the pr/needs-review This PR needs a review from a member of the team. label Jan 14, 2021
@jonemo
Copy link
Contributor

jonemo commented Nov 23, 2022

Hi @alanyee, thank you for creating this PR and apologies for the very late response. Because boto3 now requires Python >=3.7, the risk of adversely affecting existing behavior is much smaller and the additional code for compatibility with versions that do not have os.fspath is no longer needed.

The next step for this PR is to remove the logic for supporting older Python versions.

@alanyee
Copy link
Contributor Author

alanyee commented Dec 4, 2022

@jonemo I have removed the logic for supporting older Python versions.

Copy link
Contributor

@jonemo jonemo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this branch out on Linux, Mac, and Windows today, looking for ways to make it fail. So far so good, the only caveat is that it currently only works with pathlib.Path objects and not other path-like objects (see inline comment).

Additional test coverage is needed:

  1. Unit tests should cover more variations on how path-like objects can be used (see inline comment)
  2. At least one integration test should also use a path-like object. I'd say it's ok to change one of the existing upload_file or download_file integration tests to use Path instead of adding a new one.

Please let me know if you want to hand any part of the remaining work off to me.

tests/unit/s3/test_transfer.py Outdated Show resolved Hide resolved
boto3/s3/transfer.py Outdated Show resolved Hide resolved

extra_args = {'ACL': 'public-read'}
self.transfer.upload_file(
Path('smallfile'), 'bucket', 'key', extra_args=extra_args
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some test coverage for path-like objects that are not pathlib.Path. pathlib.PurePath might be a suitable option.

Considering how paths have a habit of causing trouble, I am also wondering if there are any other edge cases that we should cover here. Are you confident that things like the following will work correctly: relative paths, paths that contain .., Windows paths with drive letters, paths that contain symlinks.

What about error cases. Is the behavior the same for a path-like object that doesn't exist and a string representation of a path that doesn't exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplify the basis by returning the fspath of the Path-like object which is a string. If there is an issue with the string, it would occur with or without this feature given that the original functionality only accepts a string. If the user messes up their Path-like object's path somehow, I think that's up to them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that with your current code there is no difference between integration-testing with a path-like object and integration testing with a string. However, one purpose of tests is to have them fail when a future code contributor changes the implementation in a way that you did not expect. That would be the purpose of having one integration test that uses a path-like object.

Copy link
Contributor Author

@alanyee alanyee Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add those as additional unit tests for path-like objects, but I will also add unit tests with those edge cases as regular strings as well.

Also, I already added an integration testing for a path-like object.

@alanyee alanyee requested a review from jonemo December 9, 2022 04:52
@alanyee
Copy link
Contributor Author

alanyee commented Dec 9, 2022

@jonemo your thoughts on the current test coverage?

@jonemo
Copy link
Contributor

jonemo commented Dec 13, 2022

Thank you for adding additional test for non-Path path-like objects. I just checked the Python docs again and found these two statements:

class pathlib.PurePath(*pathsegments)
A generic class that represents the system’s path flavour (instantiating it creates either a PurePosixPath or a PureWindowsPath)

and

class pathlib.Path(*pathsegments)
A subclass of PurePath, this class represents concrete paths of the system’s path flavour (instantiating it creates either a PosixPath or a WindowsPath)

So there is no need to write separate tests for PurePath, PurePosixPath, and PureWindowsPath. A single test with PurePath will test PurePosixPath on Posix systems and PureWindowsPath on Windows systems.

self.manager.download.assert_called_with(
'bucket',
'key',
os.path.normpath('/tmp/smallfile'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is os.path.normpath needed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tested it now, but the keys can be messed up on Windows:

Expected: download('bucket', 'key', '/tmp/smallfile', {'SSECustomerKey': 'foo', 'SSECustomerAlgorithm': 'AES256'}, None)
E           Actual: download('bucket', 'key', '\\tmp\\smallfile'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to use an absolute path, make sure the first character is not a slash ('/') Use pathlib or the os.path.normpath() function to ensure that paths are properly encoded for Win32.

I will add normpath to ensure it is properly encoded independent of the system.

@jonemo jonemo requested review from jonathan343 and jonemo and removed request for jonemo December 16, 2022 00:45
Copy link
Contributor

@jonathan343 jonathan343 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@jonemo jonemo merged commit 3442539 into boto:develop Dec 16, 2022
@jonemo
Copy link
Contributor

jonemo commented Dec 16, 2022

In lieu of another round of review comments, I added a commit with a minor change to the tests, changelog, and merged master. Thank you again @alanyee for the work on this (and the patience). And thanks to @jonathan343 for the final review!

aws-sdk-python-automation added a commit that referenced this pull request Dec 16, 2022
* release-1.26.32:
  Bumping version to 1.26.32
  Add changelog entries from botocore
  S3 upload_file, download file to support path-lib objects (#2259)
@alanyee alanyee deleted the patch-1 branch December 19, 2022 17:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-review pr/needs-review This PR needs a review from a member of the team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants