Course Outline
Introduction to Apache Airflow
- Understanding workflow orchestration.
- Key features and benefits of Apache Airflow.
- Airflow 2.x improvements and ecosystem overview.
Architecture and Core Concepts
- Scheduler, web server, and worker processes.
- DAGs, tasks, and operators.
- Executors and backends (Local, Celery, Kubernetes).
Installation and Setup
- Installing Airflow in local and cloud environments.
- Configuring Airflow with different executors.
- Setting up metadata databases and connections.
Navigating the Airflow UI and CLI
- Exploring the Airflow web interface.
- Monitoring DAG runs, tasks, and logs.
- Using the Airflow CLI for administration.
Authoring and Managing DAGs
- Creating DAGs with the TaskFlow API.
- Using operators, sensors, and hooks.
- Managing dependencies and scheduling intervals.
Integrating Airflow with Data and Cloud Services
- Connecting to databases, APIs, and message queues.
- Running ETL pipelines with Airflow.
- Cloud integrations: AWS, GCP, Azure operators.
Monitoring and Observability
- Task logs and real-time monitoring.
- Metrics with Prometheus and Grafana.
- Alerting and notifications via email or Slack.
Securing Apache Airflow
- Role-based access control (RBAC).
- Authentication with LDAP, OAuth, and SSO.
- Secrets management with Vault and cloud secret stores.
Scaling Apache Airflow
- Parallelism, concurrency, and task queues.
- Using CeleryExecutor and KubernetesExecutor.
- Deploying Airflow on Kubernetes with Helm.
Best Practices for Production
- Version control and CI/CD for DAGs.
- Testing and debugging DAGs.
- Maintaining reliability and performance at scale.
Troubleshooting and Optimization
- Debugging failed DAGs and tasks.
- Optimizing DAG performance.
- Common pitfalls and how to avoid them.
Summary and Next Steps
Requirements
- Experience with Python programming.
- Familiarity with data engineering or DevOps concepts.
- Understanding of ETL or workflow orchestration.
Audience
- Data scientists.
- Data engineers.
- DevOps and infrastructure engineers.
- Software developers.
Testimonials
The instructor adapted the training to the participants’ level and responded to all questions. He was very communicative, and it was easy to interact with him. I really appreciated the format of the training, which included many practical exercises. Overall, it was a very engaging and well-organized session.
Jacek Chlopik - ZAKLAD UBEZPIECZEN SPOLECZNYCH
Course - Apache Airflow: Building and Managing Data Pipelines
The training was spot on. Very useful theory and exercises.
Vladimir - PUBLIC COURSE
Course - Apache Airflow
The training was spot on in all aspects. Useful theoretical aspects and exercises.
Vladimir - PUBLIC COURSE
Course - Apache Airflow