Airflow 2.0 on docker-compose

Clement Demonchy
3 min read · Jan 25, 2021


So you heard of the brand new Airflow 2.0 and want to run it on your own machine to give it a go.

Sadly, the official getting-started guide isn’t nearly enough for an integration environment, as it skips most of the interesting components and features, such as multiple workers to distribute tasks.

Don’t worry, I have you covered.

In this article I will show you a dummy project with:

  • Everything running under docker-compose using the official Airflow Docker image
  • Separate containers for the Airflow scheduler and webserver
  • A PostgreSQL backend for Airflow metadata that allows running tasks in parallel
  • Multiple workers running with Celery
  • Flower as Celery’s GUI
  • Home-made DAGs instead of the Airflow examples

Here is the full infrastructure we want to create; the project repository is at https://github.com/knil-sama/itm.
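As a rough orientation, here is a minimal compose sketch of that layout. Service names, image tags and the use of Redis as the Celery broker are assumptions on my part; the actual docker-compose file in the repository may differ.

```
version: "3.7"
services:
  postgres:                      # metadata database backing Airflow
    image: postgres:13
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
  redis:                         # Celery message broker (assumed)
    image: redis:6
  webserver:                     # Airflow GUI, reachable on localhost:8080
    image: apache/airflow:2.0.0
    command: webserver
    ports:
      - "8080:8080"
  scheduler:                     # schedules DAG runs and queues tasks
    image: apache/airflow:2.0.0
    command: scheduler
  worker:                        # Celery worker actually executing the tasks
    image: apache/airflow:2.0.0
    command: celery worker
  flower:                        # Celery monitoring GUI on localhost:5555
    image: apache/airflow:2.0.0
    command: celery flower
    ports:
      - "5555:5555"
```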

You can then access the Airflow GUI.

The first thing you want to do once everything is up and running is to log in to the Airflow webserver with user “admin” and password “test”.

Then we can activate our DAG “main_dag”, which will be picked up by the scheduler every 5 minutes (or you can click the play icon to trigger it by hand).

See the Airflow documentation for more details regarding DAG runs.

Another interface you can interact with is Flower, which provides monitoring of your Celery workers.

If we take a look at the docker-compose file, we can see the dependencies between the machines and the starting order:

webserver -> scheduler -> worker, then flower
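In compose terms this ordering can be sketched with depends_on (a sketch, reusing the service names assumed above; the real file may express it differently):

```
services:
  scheduler:
    depends_on:
      - webserver   # the webserver initializes the backend db first
  worker:
    depends_on:
      - scheduler
  flower:
    depends_on:
      - worker
```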

Each machine starts with a dedicated airflow command, except the webserver, where we do two operations before starting it (see the sketch after this list):

  1. Initialize the PostgreSQL backend database (to simplify the demo, we don’t keep persistent storage and reinitialize it on each run, in order to avoid conflicts when running the init against an already existing backend)
  2. Create an admin user, because Airflow doesn’t provide one by default, only roles (you can also allow users to self-register)
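A hedged sketch of what that webserver startup can look like in the compose file; the exact command in the repository may differ, but the three steps are the same:

```
  webserver:
    image: apache/airflow:2.0.0
    # initialize the metadata db, create the admin user, then start the webserver
    # (the e-mail address below is only a placeholder)
    command: >-
      bash -c "airflow db init
      && airflow users create --role Admin --username admin --password test
      --firstname admin --lastname admin --email admin@example.com
      && airflow webserver"
```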

Another thing to consider is that the DAG code is shared among the machines above, but they don’t have a shared filesystem at runtime, so we rely on XCom to pass simple variables between tasks and store images in MongoDB.
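To make that concrete, here is a hedged sketch: a MongoDB service next to the Airflow ones, with the same local DAG folder mounted into every container. The mount path, image tag and service names are assumptions; the repository may distribute the DAGs differently, for instance by baking them into its own image.

```
services:
  mongodb:                        # stores the images produced by the tasks
    image: mongo:4.4
  webserver:
    volumes:
      - ./dags:/opt/airflow/dags  # same DAG code visible to every container
  scheduler:
    volumes:
      - ./dags:/opt/airflow/dags
  worker:
    volumes:
      - ./dags:/opt/airflow/dags
```

Small values still travel between tasks through XCom, while the heavier payloads (the images) are written to MongoDB by one task and read back by the next, since the worker containers don’t share a disk.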

To avoid duplicated configuration, we create a single env file, env_airflow.env.
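Each Airflow service can then reference that same file from the compose file (a sketch, reusing the service names assumed above):

```
  webserver:
    env_file: env_airflow.env
  scheduler:
    env_file: env_airflow.env
  worker:
    env_file: env_airflow.env
  flower:
    env_file: env_airflow.env
```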

Here are some details of the configuration that is needed.
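Shown below as the equivalent environment entries (in the repository they live as KEY=value lines in env_airflow.env); this is my own guess at the minimum needed for the CeleryExecutor with a PostgreSQL backend, so the connection strings, credentials and the Redis broker are assumptions:

```
environment:
  AIRFLOW__CORE__EXECUTOR: CeleryExecutor
  AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
  AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
  AIRFLOW__CELERY__BROKER_URL: redis://redis:6379/0
  AIRFLOW__CORE__LOAD_EXAMPLES: "False"   # home-made DAGs only, no Airflow examples
```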

After that, you are all set to experiment with Airflow yourself.
