Data Pipelines with Apache Airflow

Publications 盘天下 2025-01-11

About the Authors

Bas Harenslak and Julian de Ruiter are data engineers with extensive experience using Airflow to develop pipelines for major companies including Heineken, Unilever, and Booking.com. Bas is a committer, and both Bas and Julian are active contributors to Apache Airflow.

About the Book

Data Pipelines with Apache Airflow is your essential guide to working with the powerful Apache Airflow pipeline manager. Expert data engineers Bas Harenslak and Julian de Ruiter take you through best practices for creating pipelines for multiple tasks, including data lakes, cloud deployments, and data science. Part desktop reference, part hands-on tutorial, this book teaches you the ins and outs of the Directed Acyclic Graphs (DAGs) that power Airflow, and how to write your own DAGs to meet the needs of your projects. You’ll learn how to automate moving and transforming data, manage pipelines by backfilling historical tasks, develop custom components for your specific systems, and set up Airflow in production environments. With complete coverage of both foundational and lesser-known features, the book leaves you ready to use Airflow for seamless data pipeline development and management.

What's Inside

Framework foundation and best practices

Airflow's execution and dependency system

Testing Airflow DAGs

Running Airflow in production
