Overview
I currently have an Airflow deployment on K8s that is coupled with dbt. The deployment's folder structure looks something like this:
├── Dockerfile
├── dags
│   └── dbt_hello_world.py
├── dbt
└── requirements.txt
The DAG defined in dbt_hello_world.py reads the model definitions (SQL files) from the dbt folder, so the DAGs need access to the dbt folder structure.
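For context, here is a simplified sketch of the kind of thing dbt_hello_world.py does; the relative path and the BashOperator invocation are illustrative, not the real file:

```python
import os
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Illustrative assumption: the dbt folder sits next to the dags folder,
# as in the structure above, and both are baked into the image.
DBT_PROJECT_DIR = os.path.join(os.path.dirname(__file__), "..", "dbt")

with DAG(
    dag_id="dbt_hello_world",
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # dbt reads the model definitions (SQL files) from the project dir,
    # which is why the DAG needs the dbt folder to be reachable on disk.
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command=f"dbt run --project-dir {DBT_PROJECT_DIR}",
    )
```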
What is happening now
Currently the DAGs are baked into the Docker image, which means that every time someone wants to change an SQL file, they have to build and push a new image and run a CI/CD pipeline.
What I want to do
I want to decouple the infrastructure (the container deployment) from the application (the DAGs + dbt files). I know that it is possible to sync the DAGs via git (explained here), i.e. let Airflow continuously pull the latest DAGs from git. I would like to use that feature and set up the DAGs in a separate repo.
Where I am stuck
To my understanding, I can't place non-DAG files inside the dags folder. If I can sync the Airflow DAGs via git, how do I sync the files that Airflow needs access to (specifically, the dbt folder)?
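To make the problem concrete: today the DAG can resolve the dbt folder relative to its own file, because both are baked into the same image. Once the DAGs are git-synced, a lookup like the following (DBT_PROJECT_DIR is a hypothetical env var and /opt/airflow/dbt a placeholder path) only works if something also delivers the dbt folder to that location:

```python
import os

# Hypothetical: the DAG would read the dbt location from configuration
# instead of a path relative to the dags folder, but the dbt folder
# still has to be synced or mounted there by some mechanism.
DBT_PROJECT_DIR = os.environ.get("DBT_PROJECT_DIR", "/opt/airflow/dbt")
```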