
Mapped tasks with operator_extra_links property object causes SerializationError #25243

Closed · josh-fell opened this issue Jul 22, 2022 · 9 comments
Labels: affected_version:2.5, area:core, kind:bug

Comments

@josh-fell (Contributor)

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

apache-airflow-providers-amazon-aws==4.1.0

Apache Airflow version

main (development)

Operating System

Debian GNU/Linux 11 (bullseye)

Deployment

Other Docker-based deployment

Deployment details

Using Breeze on main branch.

What happened

Attempting to create dynamically-mapped tasks using the BatchOperator fails with the following DAG import error:

Broken DAG: [/files/dags/batchop_dtm.py] Traceback (most recent call last):
  File "/opt/airflow/airflow/serialization/serialized_objects.py", line 693, in _serialize_node
    op.operator_extra_links
  File "/opt/airflow/airflow/serialization/serialized_objects.py", line 999, in _serialize_operator_extra_links
    for operator_extra_link in operator_extra_links:
TypeError: 'property' object is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/airflow/airflow/serialization/serialized_objects.py", line 1175, in to_dict
    json_dict = {"__version": cls.SERIALIZER_VERSION, "dag": cls.serialize_dag(var)}
  File "/opt/airflow/airflow/serialization/serialized_objects.py", line 1083, in serialize_dag
    raise SerializationError(f'Failed to serialize DAG {dag.dag_id!r}: {e}')
airflow.exceptions.SerializationError: Failed to serialize DAG 'batchop_dtm': 'property' object is not iterable
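
A minimal sketch of the failure mode, assuming (per the issue title) that the operator declares operator_extra_links as an instance @property rather than a class-level tuple; the class names below are made up for illustration:

# Illustration only: accessing a @property through the class returns the
# property descriptor itself rather than an iterable of links, which matches
# the "TypeError: 'property' object is not iterable" in the traceback above.

class StaticLinksOperator:
    # Class-level tuple: iterable whether accessed via the class or an instance.
    operator_extra_links = ()


class DynamicLinksOperator:
    # Instance @property: accessing it on the class yields the descriptor.
    @property
    def operator_extra_links(self):
        return ()


print(list(StaticLinksOperator.operator_extra_links))  # []

try:
    list(DynamicLinksOperator.operator_extra_links)
except TypeError as exc:
    print(exc)  # 'property' object is not iterable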

What you think should happen instead

Users should be able to use Dynamic Task Mapping to generate BatchOperator tasks without a DAG import/serialization error.

How to reproduce

  1. Create a DAG similar to the following in which BatchOperator tasks are dynamically-mapped. Note this is a "toy" example, but it should be applicable to more "real-world" use cases.
from pendulum import datetime

from airflow.decorators import dag
from airflow.providers.amazon.aws.operators.batch import BatchOperator


@dag(start_date=datetime(2022, 1, 1), schedule_interval=None)
def batchop_dtm():
    BatchOperator.partial(
        task_id='submit_batch_job',
        job_queue="batch_job_queue_name",
        job_definition="batch_job_definition_name",
        overrides={},
        # Submit the job without waiting for completion
        wait_for_completion=False,
    ).expand(job_name=["job_1", "job_2", "job_3"])


_ = batchop_dtm()
  2. Start up an Airflow environment using Breeze: breeze start-airflow
  3. The following DAG import error is generated:

(Screenshot: the DAG import error shown in the Airflow UI)

Anything else

A similar issue was created previously with related fixes in #24676 and #25215.

I suspect the same behavior would occur using the BigQueryExecuteQueryOperator as well.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct

@uranusjr (Member)

Instead of going through the trouble of trying to accommodate dynamic extra links, perhaps we should just detect those, mark them as dynamic, and indicate they can’t be shown.
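
A hypothetical sketch of that detection (illustrative names only, not the actual serializer code): treat extra links as dynamic when the attribute on the operator class is a property descriptor, and skip them instead of failing.

# Hypothetical helper, not Airflow's real API: return static extra links,
# or None when the operator class defines them dynamically via a property.
def get_serializable_extra_links(op_class):
    attr = getattr(op_class, "operator_extra_links", ())
    if isinstance(attr, property):
        # Dynamic links cannot be evaluated without an operator instance;
        # report them as "can't be shown" instead of raising during serialization.
        return None
    return list(attr)

With something like this, a mapped operator with dynamic links would serialize with those links omitted rather than breaking DAG import.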

@amenzel1986
I am seeing similar issues with the ECSOperator.

@eladkal (Contributor) commented Sep 30, 2022

Should be fixed by #25500

@josh-fell (Contributor, Author) commented Feb 7, 2023

Reopening this issue. Trying to map BatchOperator on Airflow 2.5.1 with Amazon provider 7.1.0 yields the original import error described above.

@josh-fell (Contributor, Author) commented Feb 7, 2023

I initially confirmed this works just fine on Airflow 2.5.0, but nope, spoke too soon.

josh-fell added the affected_version:2.5 label on Feb 7, 2023
josh-fell changed the title from "Mapped BatchOperator tasks causes SerializationError due to operator_extra_links property object" to "Mapped tasks with operator_extra_links property object causes SerializationError" on Feb 7, 2023
josh-fell added the area:core label and removed the provider:amazon-aws and area:providers labels on Feb 7, 2023
@hsilva-evisit commented Feb 20, 2023

I'm facing the same issue.

@asherkhb (Contributor) commented Sep 15, 2023

Does anyone by chance know a workaround for this issue (specifically with the AWS Batch operator mentioned in the original report)?

My current "workaround" is to just use Amazon provider <=3.4.0, which isn't ideal but does allow the BatchOperator to be used with dynamic task mapping...

@Taragolis (Contributor) commented Sep 15, 2023

@Taragolis (Contributor)