Concurrent delete raises exception but performs delete #2509

Open
echai58 opened this issue May 13, 2024 · 2 comments
Labels
binding/python Issues for the Python package bug Something isn't working

Comments


echai58 commented May 13, 2024

Environment

Delta-rs version: 0.17.3

Binding: python


Bug

What happened:
When performing a concurrent write and delete, the delete operation raises DeltaError: Generic DeltaTable error: Version mismatch, but the delete is still committed.

What you expected to happen:
The output should match the actual result of the operation. I'd be okay with either the concurrent delete failing with an exception, or succeeding without an exception.

How to reproduce it:

from deltalake import DeltaTable, write_deltalake
import pandas as pd

path = "test-concurrent"

df = pd.DataFrame.from_dict(
    {
        "k": [1],
        "v": [1],
    }
)

write_deltalake(
    path,
    df,
    mode="overwrite"
)

# by getting both delta tables first, it simulates concurrent actions
table_1 = DeltaTable(path)
table_2 = DeltaTable(path)

data_1 = pd.DataFrame.from_dict(
    {
        "k": [3],
        "v": [-3],
    }
)

write_deltalake(
    path,
    data_1,
    mode="append"
)

table_2.delete("k = 1")

If you inspect the table data after the delete, you'll see the data was deleted, and the commit log includes a 002.json indicating the successful delete.
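
One way to check this without going through the library is to list the commit files directly. This is a minimal sketch, assuming the table lives on the local filesystem and uses the standard `_delta_log` layout, where each commit is a zero-padded version number with a `.json` suffix (so "002.json" above is shorthand for `00000000000000000002.json`). `list_commits` is a hypothetical helper name, not part of deltalake.

```python
from pathlib import Path


def list_commits(table_path):
    """Return the commit versions found in <table_path>/_delta_log, sorted."""
    log_dir = Path(table_path) / "_delta_log"
    versions = []
    for entry in log_dir.glob("*.json"):
        # Commit files are named by zero-padded version,
        # e.g. 00000000000000000002.json.
        # Skip any .json file whose stem is not purely numeric.
        if entry.stem.isdigit():
            versions.append(int(entry.stem))
    return sorted(versions)
```

After running the reproduction, `list_commits("test-concurrent")` would be expected to include version 2 if the delete was committed despite the exception.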

More details:

@echai58 echai58 added the bug Something isn't working label May 13, 2024
@ion-elgreco
Collaborator

What happens if you do table_2.update_incremental() before running delete?

@echai58
Author

echai58 commented May 13, 2024

@ion-elgreco Yeah, running update_incremental before running delete allows the delete to run correctly, which makes sense because the delete is then no longer concurrent with the append.
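
As a stopgap, the refresh-then-delete sequence discussed here could be wrapped in a small helper. This is only a sketch, assuming `table` is a `deltalake.DeltaTable` (or anything exposing `update_incremental()` and `delete()`); `refresh_then_delete` is a hypothetical name. It only narrows the window, since another writer can still commit between the refresh and the delete.

```python
def refresh_then_delete(table, predicate):
    """Refresh the table handle to the latest version, then delete matching rows."""
    # Pick up commits made by other writers since this handle was created.
    table.update_incremental()
    # Run the delete against the refreshed snapshot and return its result.
    return table.delete(predicate)
```

In the reproduction above, this would be called as `refresh_then_delete(table_2, "k = 1")` in place of `table_2.delete("k = 1")`.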

@rtyler rtyler added the binding/python Issues for the Python package label May 15, 2024