In Apache Airflow, XCom (short for "cross-communication") is the built-in mechanism for tasks within the same DAG to exchange data. It is lightweight and intended for small values, such as IDs or flags, rather than large datasets.

How XCom Works

XComs store data in Airflow's metadata database, specifically within the xcom table. Each entry is identified by its key and by the task_id and dag_id of the task that pushed it, and it holds the pushed value.
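
The storage idea can be illustrated with a toy in-memory model. This is only a sketch and not Airflow code: ToyXComTable and its methods are hypothetical names, and the choice of (dag_id, run_id, task_id, key) as the identifying tuple is an assumption made for illustration, not necessarily the exact attribute list meant above.

```python
# Toy in-memory stand-in for the xcom table; this is NOT Airflow code.
# Assumption: a row is identified by (dag_id, run_id, task_id, key).
from typing import Any, Dict, Tuple

XComKey = Tuple[str, str, str, str]


class ToyXComTable:
    def __init__(self) -> None:
        self._rows: Dict[XComKey, Any] = {}

    def push(self, dag_id: str, run_id: str, task_id: str, key: str, value: Any) -> None:
        # In this toy model, pushing again with the same identifiers
        # overwrites the earlier value.
        self._rows[(dag_id, run_id, task_id, key)] = value

    def pull(self, dag_id: str, run_id: str, task_id: str, key: str = "return_value") -> Any:
        # Returns None when no matching row exists.
        return self._rows.get((dag_id, run_id, task_id, key))


table = ToyXComTable()
table.push("example_dag", "run_1", "get_count", "return_value", 100)

count = table.pull("example_dag", "run_1", "get_count")    # same run: value found
missing = table.pull("example_dag", "run_2", "get_count")  # different run: nothing stored
```

The point of the sketch is that a value is looked up by who pushed it and under which key, which is why the pull calls later in this section name a task ID and, optionally, a key.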

Using XCom in Tasks

Pushing Data

  1. Implicitly through return:
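# Import path for Airflow 2+; on Airflow 1.x use airflow.operators.python_operator
from airflow.operators.python import PythonOperator

def get_count():
    # The return value is pushed to XCom automatically
    return 100

# `dag` is assumed to be a DAG object defined earlier in the file
count_task = PythonOperator(
    task_id='get_count',
    python_callable=get_count,
    dag=dag
)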

The integer 100 is automatically pushed as an XCom with key return_value.

  2. Explicitly pushing:
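def push_custom_xcom(**context):
    # context['ti'] is the running TaskInstance; xcom_push stores the value under an explicit key
    context['ti'].xcom_push(key='my_custom_key', value='custom_value')

custom_task = PythonOperator(
    task_id='push_custom',
    python_callable=push_custom_xcom,
    provide_context=True,  # needed only on Airflow 1.x; omit on Airflow 2+, where the context is passed automatically
    dag=dag
)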

Pulling Data

To retrieve XCom data in a downstream task:

def pull_xcom(**context):
    ti = context['ti']
    # With no key given, xcom_pull looks up the default 'return_value' key
    count = ti.xcom_pull(task_ids='get_count')
    # A custom key must be named explicitly
    custom_value = ti.xcom_pull(task_ids='push_custom', key='my_custom_key')
    print(f'Count: {count}, Custom Value: {custom_value}')

pull_task = PythonOperator(
    task_id='pull_data',
    python_callable=pull_xcom,
    provide_context=True,  # needed only on Airflow 1.x; omit on Airflow 2+, where the context is passed automatically
    dag=dag
)
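
One way to exercise callables like these without a running scheduler is to pass a stand-in for context['ti']. The sketch below assumes a hypothetical FakeTaskInstance class (not part of Airflow) that mimics only the xcom_push and xcom_pull call shapes used above. Here pull_xcom returns its message instead of printing it, so the result can be checked.

```python
# Hypothetical stub, NOT an Airflow class: it mimics only the
# xcom_push / xcom_pull call shapes used by the callables above.
class FakeTaskInstance:
    def __init__(self, task_id, store):
        self.task_id = task_id
        self.store = store  # shared dict keyed by (task_id, key)

    def xcom_push(self, key, value):
        self.store[(self.task_id, key)] = value

    def xcom_pull(self, task_ids, key="return_value"):
        # Returns None if nothing was pushed under (task_ids, key).
        return self.store.get((task_ids, key))


def push_custom_xcom(**context):
    context["ti"].xcom_push(key="my_custom_key", value="custom_value")


def pull_xcom(**context):
    ti = context["ti"]
    count = ti.xcom_pull(task_ids="get_count")
    custom_value = ti.xcom_pull(task_ids="push_custom", key="my_custom_key")
    return f"Count: {count}, Custom Value: {custom_value}"


store = {}
# Simulate the implicit push of get_count's return value.
FakeTaskInstance("get_count", store).xcom_push("return_value", 100)
push_custom_xcom(ti=FakeTaskInstance("push_custom", store))
message = pull_xcom(ti=FakeTaskInstance("pull_data", store))
```

A stub like this checks only the callable's own logic. It says nothing about scheduling, serialization, or how a real Airflow deployment stores the values.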

Common Use Cases

  1. Passing Identifiers: Tasks often generate IDs (like record IDs after database inserts) that are required downstream.
  2. Conditional Branching: Tasks can push boolean flags or other conditions that a downstream branching task uses to decide which path the workflow takes.
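
The branching use case can be sketched as a callable that pulls a flag and returns the ID of the task to run next. The task IDs (check_source, load_rows, skip_load), the has_new_rows key, and the StubTI class are all hypothetical names chosen for illustration. The stub stands in for Airflow's task instance so the logic can run on its own.

```python
# Sketch of a branching callable. `ti` is stubbed so the logic runs
# without Airflow; task IDs and the XCom key are hypothetical.
class StubTI:
    def __init__(self, values):
        self._values = values  # maps (task_id, key) -> value

    def xcom_pull(self, task_ids, key="return_value"):
        return self._values.get((task_ids, key))


def choose_branch(**context):
    # Pull the boolean flag pushed by the (hypothetical) upstream task.
    has_new_rows = context["ti"].xcom_pull(task_ids="check_source", key="has_new_rows")
    # Return the ID of the task that should run next; a missing flag
    # (None) is treated the same as False.
    return "load_rows" if has_new_rows else "skip_load"


branch_when_true = choose_branch(ti=StubTI({("check_source", "has_new_rows"): True}))
branch_when_missing = choose_branch(ti=StubTI({}))
```

In a DAG, a function like this would typically be passed as python_callable to BranchPythonOperator, which follows the task whose ID the function returns. The same pull pattern applies to the identifier use case, with a record ID in place of the flag.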