Is it possible to download using gsutil command? #1270

Open
jdominguez408 opened this issue Jul 31, 2023 · 0 comments
jdominguez408 commented Jul 31, 2023

In this example (file /examples/gsutil/gsutil-example.sh):

# Copyright 2023 Francisco Souza. All rights reserved.
# Use of this source code is governed by a BSD-style
# license that can be found in the LICENSE file.

set -euo pipefail

bucket_name=some-bucket
project_id=test-project
here=$(cd "$(dirname "${0}")" && pwd -P)

# create bucket
gsutil -o "Credentials:gs_json_host=127.0.0.1" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" mb -p "${project_id}" "gs://${bucket_name}"

# list objects in the bucket (should be empty)
gsutil -o "Credentials:gs_json_host=127.0.0.1" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" ls -p "${project_id}" "gs://${bucket_name}"

# upload a couple of files
gsutil -o "Credentials:gs_json_host=127.0.0.1" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" cp "${here}"/hello.txt "${here}"/image.png "gs://${bucket_name}/"

# list objects in the bucket (should include the files that were just uploaded)
gsutil -o "Credentials:gs_json_host=127.0.0.1" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" ls -p "${project_id}" "gs://${bucket_name}" 
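
The three connection overrides are repeated on every call in the script. As a minimal sketch, they could be wrapped once in a helper. The name `gsutil_fake` is hypothetical and is not part of gsutil-example.sh; the option values are the ones already used in the script.

```shell
# Hypothetical helper, not part of the original example script.
# Forwards all arguments to gsutil, prepending the overrides that point it
# at the local fake-gcs-server (host, port, no certificate validation).
gsutil_fake() {
  gsutil \
    -o "Credentials:gs_json_host=127.0.0.1" \
    -o "Credentials:gs_json_port=4443" \
    -o "Boto:https_validate_certificates=False" \
    "$@"
}

# Usage (same operations as in the script):
#   gsutil_fake mb -p "${project_id}" "gs://${bucket_name}"
#   gsutil_fake ls -p "${project_id}" "gs://${bucket_name}"
```

Because the helper only prepends options, any gsutil subcommand can be passed through it unchanged.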

The example shows several gsutil operations, but not downloading. I tried to run the following command, modeled on the ones above:

gsutil -o "Credentials:gs_json_host=127.0.0.1" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" cp "gs://${bucket_name}/hello.txt" "${here}"/hello.txt


But I always get the same error:

  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
    gsutil.RunMain()
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gsutil.py", line 151, in RunMain
    sys.exit(gslib.__main__.main())
             ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 436, in main
    return _RunNamedCommandAndHandleExceptions(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 785, in _RunNamedCommandAndHandleExceptions
    _HandleUnknownFailure(e)
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 633, in _RunNamedCommandAndHandleExceptions
    return command_runner.RunNamedCommand(command_name,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 421, in RunNamedCommand
    return_code = command_inst.RunCommand()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1131, in RunCommand
    self.Apply(_CopyFuncWrapper,
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1575, in Apply
    self._SequentialApply(func, args_iterator, exception_handler, caller_id,
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1654, in _SequentialApply
    worker_thread.PerformTask(task, self)
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/command.py", line 2404, in PerformTask
    results = task.func(cls, task.args, thread_state=self.thread_gsutil_api)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 673, in _CopyFuncWrapper
    cls.CopyFunc(args,
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 913, in CopyFunc
    _, bytes_transferred, result_url, md5 = copy_helper.PerformCopy(
                                            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 3949, in PerformCopy
    return _DownloadObjectToFile(src_url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 3141, in _DownloadObjectToFile
    bytes_transferred, server_encoding = _DownloadObjectToFileResumable(
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 2948, in _DownloadObjectToFileResumable
    server_encoding = gsutil_api.GetObjectMedia(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/cloud_api_delegator.py", line 352, in GetObjectMedia
    return self._GetApi(provider).GetObjectMedia(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/gcs_json_api.py", line 1202, in GetObjectMedia
    apitools_download = apitools_transfer.Download.FromData(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/transfer.py", line 253, in FromData
    url = client.FinalizeTransferUrl(info['url'])
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 459, in FinalizeTransferUrl
    url_builder = _UrlBuilder.FromUrl(url)
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 189, in FromUrl
    return cls(
           ^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 170, in __init__
    components = urllib.parse.urlsplit(_urljoin(
                                       ^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 160, in _urljoin
    new_base = base if base.endswith('/') else base + '/'
                       ^^^^^^^^^^^^^^^^^^
TypeError: endswith first arg must be bytes or a tuple of bytes, not str


The requests reach fake-gcs-server like this:

[fake-gcs-server] time="2023-08-02T11:15:55Z" level=info msg="127.0.0.1 - - [02/Aug/2023:11:15:55 +0000] \"GET /storage/v1/b/some-bucket/o/hello.txt?alt=json&fields=contentType%2Cgeneration%2Ccrc32c%2Cmd5Hash%2Cetag%2Cname%2CcustomerEncryption%2Csize%2CcontentEncoding%2CmediaLink&projection=noAcl HTTP/1.1\" 200 472"
[fake-gcs-server] time="2023-08-02T11:16:36Z" level=info msg="127.0.0.1 - - [02/Aug/2023:11:16:36 +0000] \"GET /storage/v1/b/some-bucket/o/hello.txt?alt=json&fields=contentType%2Cetag%2CcontentEncoding%2Cgeneration%2Cname%2Csize%2CcustomerEncryption%2Ccrc32c%2CmediaLink%2Cmd5Hash&projection=noAcl HTTP/1.1\" 200 472"

As I understand it, downloading an object requires the query parameter alt=media instead of alt=json. It looks like the client, gsutil in this case, sends its requests with alt=json. The same command works perfectly against a real bucket. Would it be possible to make fake-gcs-server compatible with gsutil so that downloads work?
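
For context, the GCS JSON API serves an object's metadata and its contents from the same path, selected by the `alt` query parameter. The sketch below builds both URLs so the two kinds of request can be tried directly with curl. The base URL `https://127.0.0.1:4443` is an assumption derived from the host and port in the gsutil options, the function names are hypothetical, and object names containing characters such as `/` would also need percent-encoding.

```shell
# Assumed address of the local fake-gcs-server (host and port taken from
# the gsutil options used in the script; HTTPS is an assumption).
base_url="https://127.0.0.1:4443"

# URL of the object's metadata (the kind of request seen in the log).
metadata_url() {
  printf '%s/storage/v1/b/%s/o/%s?alt=json\n' "${base_url}" "$1" "$2"
}

# URL of the object's contents.
media_url() {
  printf '%s/storage/v1/b/%s/o/%s?alt=media\n' "${base_url}" "$1" "$2"
}

# Usage (-k skips certificate validation, like https_validate_certificates=False):
#   curl -k "$(metadata_url some-bucket hello.txt)"
#   curl -k -o hello.txt "$(media_url some-bucket hello.txt)"
```

Comparing the two responses outside gsutil may help narrow down whether the failure is in the server's response or in how the client builds the download URL.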

jdominguez408 changed the title from "I need download with gsutil command" to "Is it possible to download using gsutil command?" on Aug 2, 2023