I am experimenting with running dbt on Airflow. So far I have managed to set up both tools in Docker Compose, using Airflow's LocalExecutor and running models with `dbt run --models ...`. To design the different DAGs I am using dbt tags to organise/filter models, and to build the models' dependencies and identify the tags, I am parsing the manifest.json file after the project has been compiled.
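For anyone curious, the manifest-parsing step can be sketched roughly like this. This is a simplified sketch, not the repo's actual code: the function name is mine, and it assumes the `nodes` / `depends_on` layout found in typical manifest.json files (exact shape varies by dbt version):

```python
import json


def extract_model_graph(manifest):
    """Build {model_name: {"tags": [...], "depends_on": [...]}} from a
    parsed manifest.json dict. Only dbt models are kept; other node
    types (tests, seeds, etc.) are skipped."""
    graph = {}
    for node_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue
        graph[node["name"]] = {
            "tags": node.get("tags", []),
            # depends_on.nodes holds ids like "model.project.my_model";
            # keep only model dependencies and strip to the bare name
            "depends_on": [
                dep.split(".")[-1]
                for dep in node.get("depends_on", {}).get("nodes", [])
                if dep.startswith("model.")
            ],
        }
    return graph


# Typical usage after `dbt compile`:
# with open("target/manifest.json") as f:
#     graph = extract_model_graph(json.load(f))
```

From a structure like this you can create one Airflow task per model (e.g. a BashOperator running `dbt run --models <name>`) and wire task dependencies from the `depends_on` lists, filtering on `tags` to decide which DAG a model belongs to.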
I am wondering:
- Do people follow alternative approaches to this with Airflow?
- Has anyone faced issues with complicated models, and how do you run them in Airflow?
The major deficiency in my design so far is that it does not utilise the raw SQL from manifest.json. That will be my next step. Other than that, I would like some feedback on how this project could be improved. In a production Airflow environment, things would probably be slightly different.
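As a rough sketch of that next step, pulling a model's SQL out of a parsed manifest.json could look like the below. Treat the field names as assumptions: older dbt releases use `raw_sql`/`compiled_sql` while newer ones (dbt >= 1.3) use `raw_code`/`compiled_code`, so this tries both. Note the compiled SQL only exists after `dbt compile` or `dbt run`; the raw SQL still contains Jinja (`{{ ref(...) }}` etc.) and is not directly executable against the warehouse:

```python
def get_model_sql(manifest, model_name, compiled=True):
    """Return the SQL for a named model from a parsed manifest.json
    dict, or None if the model is not found. Tries both old and new
    manifest key names since they changed across dbt versions."""
    keys = ("compiled_code", "compiled_sql") if compiled else ("raw_code", "raw_sql")
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") == "model" and node.get("name") == model_name:
            for key in keys:
                if key in node:
                    return node[key]
    return None
```

The compiled SQL is the interesting one for Airflow, since it could be handed straight to a database operator instead of shelling out to the dbt CLI per model.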
The repository is at https://github.com/konosp/dbt-airflow-docker-compose and my general post about it is at https://analyticsmayhem.com/dbt/apache-airflow-dbt-docker-compose/.
You can clone and run if you wish using the Kaggle dataset mentioned in the instructions (not all sample data can be uploaded in GitHub).
Any feedback is welcome!