Multi Asset Check unable to access the Asset's output #21772

Open
KarMishRMS opened this issue May 10, 2024 · 0 comments
Labels: area: asset-checks (Related to Asset Checks), type: bug (Something isn't working)


KarMishRMS commented May 10, 2024

Dagster version

1.7.3

What's the issue?

Hi, I am using the multi_asset_check decorator; my goal is to apply multiple checks to a single asset. Earlier I was using asset_check, and I could easily access the asset's output and apply the check to it. (FYI: my asset output is a pandas DataFrame.) Now, with multi_asset_check, I am unable to access the asset's output. I do not understand why I can retrieve the output when working with asset_check but not with multi_asset_check. Is there a different syntax, or something I am missing?
Please have a look at the code below. In the combined_check function I want the asset's output in the df variable. Earlier I just used context_df and it worked fine; now I am stuck.
I looked through the documentation and other issues, but they did not help.

Below is my code for reference:

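from dagster import AssetCheckResult, AssetCheckSpec, MetadataValue, Output, asset, multi_asset_check
from pandas import DataFrame


@asset(group_name="LAMBDA")
def lambda_feed_asset() -> Output[DataFrame]:
    # runner is not shown here; it is defined elsewhere in my project.
    df = runner.download()
    return Output(
        df,
        metadata={"result df": MetadataValue.md(df.head().to_markdown())},
    )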


@multi_asset_check(
    specs=[
        AssetCheckSpec("speed check", asset=lambda_feed_asset),
        AssetCheckSpec("null_check", asset=lambda_feed_asset),
    ],
)
def combined_check():
    df = asset  ## STUCK HERE - How do I access the asset's output so that I can use it here?
    checkcols = ['station', 'date', 'time']
    # Rows whose speed is outside the allowed range.
    speed_check_df = df[(df['speed'] < 0) | (df['speed'] > 100000)]
    # Rows where any of the checked columns is null.
    null_check_df = df[df[checkcols].isnull().any(axis=1)]

    results = [
        AssetCheckResult(
            check_name="speed check",
            passed=len(speed_check_df) == 0,
            metadata={
                "speed check df": MetadataValue.md(speed_check_df.head().to_markdown())
            },
        ),
        AssetCheckResult(
            check_name="null_check",
            passed=len(null_check_df) == 0,
            metadata={
                "num_rows_null": len(null_check_df),
                "null count df": MetadataValue.md(null_check_df.head().to_markdown()),
            },
        ),
    ]
    return results
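
For anyone trying to reproduce the check logic without Dagster, here is a minimal pandas-only sketch of the two checks described above. It does not answer how the DataFrame gets loaded inside multi_asset_check; it only isolates the filtering so it can be tested separately. The function name run_checks and the sample data are hypothetical and not part of the original code. The column names and the 0 to 100000 speed range are taken from the snippet above.

```python
import pandas as pd


def run_checks(df: pd.DataFrame, check_cols: list[str]) -> dict[str, pd.DataFrame]:
    """Return the offending rows for each check, keyed by check name.

    Hypothetical helper for illustration; not part of Dagster.
    """
    # Rows whose speed falls outside the 0..100000 range used in the issue.
    speed_bad = df[(df["speed"] < 0) | (df["speed"] > 100000)]
    # Row mask: True where any of the checked columns is null.
    # Indexing with the 2-D frame df[check_cols].isnull() instead masks
    # individual cells and keeps every row, so len() would not count failures.
    null_bad = df[df[check_cols].isnull().any(axis=1)]
    return {"speed check": speed_bad, "null_check": null_bad}


# Made-up sample data, only to exercise the function.
sample = pd.DataFrame(
    {
        "station": ["A", None, "C"],
        "date": ["2024-05-01", "2024-05-02", "2024-05-03"],
        "time": ["10:00", "11:00", "12:00"],
        "speed": [50, 120000, -5],
    }
)
failures = run_checks(sample, ["station", "date", "time"])
for name, rows in failures.items():
    print(name, "passed" if rows.empty else f"failed on {len(rows)} row(s)")
```

Keeping the filtering in a plain function means the same logic can back either a single asset_check or each result of a multi_asset_check once the DataFrame is available.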

What did you expect to happen?

No response

How to reproduce?

No response

Deployment type

Local

Deployment details

No response

Additional information

No response

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

@KarMishRMS KarMishRMS added the type: bug Something isn't working label May 10, 2024
@garethbrickman garethbrickman added the area: asset-checks Related to Asset Checks label May 10, 2024
@johannkm johannkm self-assigned this May 20, 2024