Not working in AWS Lambda (0.16.2 - 0.17.4) OSError: Generic S3 error #2511

Open
TheKnightCoder opened this issue May 14, 2024 · 0 comments
Labels: bug (Something isn't working), storage/aws (AWS S3 storage related)

TheKnightCoder commented May 14, 2024

Environment

Delta-rs version: 0.16.2 - 0.17.4

Binding: Python

Environment:

  • Cloud provider: AWS Lambda
  • OS: Python3.12
  • Other: Python3.12 + AWS SDK for Pandas Layer + Custom layer with Deltalake 0.17.4 and pyarrow_hotfix 0.6

Bug

What happened: The code below works in version 0.15.3 but fails in 0.16.2 - 0.17.4.
Code snippet:

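import pandas as pd
from deltalake import write_deltalake

# BUCKET_NAME must be set to the target bucket name (not shown in the report)
table_path = f's3://{BUCKET_NAME}/bronze/test'

def insert():
    print('writing')
    data = {"id": [1, 2], "b": [2, 2]}
    df = pd.DataFrame(data)
    print(df)
    # Overwrite the table, replacing the schema if it differs
    write_deltalake(table_path, df, mode='overwrite', overwrite_schema=True)
    print('written')

insert()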

What you expected to happen:
The write should succeed, as it does in 0.15.3. Instead, calling write_deltalake or any other operation throws this error:

[ERROR] OSError: Generic S3 error: Error after 10 retries in 2.612435942s, max_retries:10, retry_timeout:180s, source:error sending request for url (https://s3.us-east-1.amazonaws.com/bucketname-bmdcgdma/bronze/test/_delta_log/_last_checkpoint): error trying to connect: invalid peer certificate: BadSignature
Traceback (most recent call last):
  File "/var/task/events/s3-update-nfts.py", line 127, in handler
    insert()
  File "/var/task/events/s3-update-nfts.py", line 35, in insert
    write_deltalake(table_path, df, mode='overwrite', overwrite_schema=True, storage_options=storage_options)
  File "/opt/python/deltalake/writer.py", line 265, in write_deltalake
    table, table_uri = try_get_table_and_table_uri(table_or_uri, storage_options)
  File "/opt/python/deltalake/writer.py", line 688, in try_get_table_and_table_uri
    table = try_get_deltatable(table_or_uri, storage_options)
  File "/opt/python/deltalake/writer.py", line 701, in try_get_deltatable
    return DeltaTable(table_uri, storage_options=storage_options)
  File "/opt/python/deltalake/table.py", line 405, in __init__
    self._table = RawDeltaTable(
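
For triage, the error string above can be split into its parts: the retry summary, the URL being requested, and the innermost cause. A minimal sketch using only the standard library; `parse_s3_error` is a hypothetical helper name, not part of deltalake, and the regexes assume the message format shown in this report:

```python
import re

# Error text copied from the report above.
ERROR = (
    "Generic S3 error: Error after 10 retries in 2.612435942s, "
    "max_retries:10, retry_timeout:180s, source:error sending request for url "
    "(https://s3.us-east-1.amazonaws.com/bucketname-bmdcgdma/bronze/test/_delta_log/_last_checkpoint): "
    "error trying to connect: invalid peer certificate: BadSignature"
)


def parse_s3_error(message: str) -> dict:
    """Extract retry count, request URL, and innermost cause (hypothetical helper)."""
    retries = re.search(r"Error after (\d+) retries", message)
    url = re.search(r"for url \((\S+?)\)", message)
    # The innermost cause is the text after the last ": " separator.
    cause = message.rsplit(": ", 1)[-1]
    return {
        "retries": int(retries.group(1)) if retries else None,
        "url": url.group(1) if url else None,
        "cause": cause,
    }


info = parse_s3_error(ERROR)
print(info)
```

The innermost cause, `invalid peer certificate: BadSignature`, suggests the failure happens during TLS certificate verification while connecting, rather than at an S3 permission check.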

How to reproduce it:

requirements.txt

deltalake==0.17.4
pyarrow_hotfix==0.6

build.sh

mkdir -p ./dist/python
pip install -r requirements.txt -t ./dist/python --no-deps --platform manylinux2014_aarch64 

I followed this article: https://delta.io/blog/2023-04-06-deltalake-aws-lambda-wrangler-pandas/
I added pyarrow_hotfix because it is a required dependency that is not available in the AWS SDK for Pandas layer.

  • Add all S3 permissions to the AWS Lambda execution role
  • Set the env var AWS_S3_ALLOW_UNSAFE_RENAME: 'true'
  • Use the layer to write a Delta table in AWS Lambda; the write throws the error above
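
The environment-variable step above can also be expressed as `storage_options` passed to `write_deltalake`, which is how the call appears in the traceback. A minimal sketch, assuming the option key matches the environment variable name; `build_storage_options` is a hypothetical helper, and the `us-east-1` region is taken from the URL in the error message:

```python
import os


def build_storage_options(env=os.environ) -> dict:
    """Collect S3-related settings into a dict of string options (hypothetical helper)."""
    options = {
        # Same flag as the env var in the steps above; the value is a string.
        "AWS_S3_ALLOW_UNSAFE_RENAME": env.get("AWS_S3_ALLOW_UNSAFE_RENAME", "true"),
    }
    region = env.get("AWS_REGION")  # Lambda sets this variable for the function
    if region:
        options["AWS_REGION"] = region
    return options


storage_options = build_storage_options({"AWS_REGION": "us-east-1"})
print(storage_options)
# Then, as in the traceback:
# write_deltalake(table_path, df, mode='overwrite', overwrite_schema=True,
#                 storage_options=storage_options)
```

Passing the options explicitly makes the reproduction independent of how the Lambda environment is configured, though it is not expected to change the certificate error itself.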

More details:
The last working version was initially reported as 0.15.3; versions 0.16.2 - 0.17.4 do not work.

Edit: 0.16.1 works too, so the bug was introduced in 0.16.2.
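
To check whether a given install falls inside the range reported here, the version strings can be compared as integer tuples. A minimal sketch; `in_reported_range` is a hypothetical helper, and the upper bound is only the last version tested in this report, not a known fix boundary:

```python
def parse_version(version: str) -> tuple:
    """Turn '0.16.2' into (0, 16, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


FIRST_BAD = parse_version("0.16.2")        # first failing version per this report
LAST_TESTED_BAD = parse_version("0.17.4")  # last version tested, not a fix boundary


def in_reported_range(version: str) -> bool:
    """True if the version lies in the failing range described above."""
    return FIRST_BAD <= parse_version(version) <= LAST_TESTED_BAD


for v in ("0.15.3", "0.16.1", "0.16.2", "0.17.4"):
    print(v, in_reported_range(v))
# Inside Lambda, the installed version could be read with:
# importlib.metadata.version("deltalake")
```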

TheKnightCoder added the bug label on May 14, 2024
TheKnightCoder changed the title from "Not working in AWS Lambda (0.17.4) OSError: Generic S3 error" to "Not working in AWS Lambda (0.16.4 - 0.17.4) OSError: Generic S3 error" on May 14, 2024
TheKnightCoder changed the title from "Not working in AWS Lambda (0.16.4 - 0.17.4) OSError: Generic S3 error" to "Not working in AWS Lambda (0.16.2 - 0.17.4) OSError: Generic S3 error" on May 14, 2024
rtyler added the storage/aws label on May 15, 2024