Hi,
I would like to run dbt using the Airflow GKEStartPodOperator, but I am struggling to find the proper way to authenticate dbt so that it can perform operations on Google Cloud BigQuery.
Here’s my profile in the profiles.yaml file:
my-profile:
  target: prod
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: my-gcp-project
      dataset: production
      location: EU
      threads: 6
      job_execution_timeout_seconds: 28800 # 8 hours
      priority: "{{ env_var('BQ_PRIORITY', 'interactive') }}"
      retries: 3
    prod:
      type: bigquery
      method: oauth
      project: my-gcp-project
      dataset: production
      location: EU
      threads: 6
      job_execution_timeout_seconds: 28800 # 8 hours
      retries: 3
And here’s my GKEStartPodOperator configuration:
def default_k8s_args(dag, default_args, name="no-name"):
    return {
        "dag": dag,
        "default_args": default_args,
        "execution_timeout": timedelta(hours=12),
        "project_id": K8S_PROJECT_ID,
        "location": K8S_LOCATION,
        "cluster_name": K8S_CLUSTER_NAME,
        "name": f"dbt-{dag.dag_id}-{name}",
        "namespace": GCP_PROJECT,
        "is_delete_operator_pod": True,
        "container_resources": COMPUTE_RESOURCES,
        "startup_timeout_seconds": 600,
        "image": f"{DBT_IMAGE_REPO}:{DBT_IMAGE_TAG}",
        "image_pull_policy": "Always",
        "env_vars": {"SLACK_BOT_TOKEN": SLACK_CONN_ID, "PYTHONUNBUFFERED": "1"},
        "gcp_conn_id": GCP_PROJECT,
    }


dbt_pre_build_tests = GKEStartPodOperator(
    task_id="pre_build_tests",
    cmds=[
        "/bin/bash",
        "-c",
        "dbt test --target prod --select assert_pks",
    ],
    **default_k8s_args(dag, default_args, name="pre_build_tests"),
)
The pod launches successfully but then I’m getting the following error:
[2024-12-05, 14:39:08 UTC] {pod_manager.py:356} INFO - 14:39:08 Found 86 models, 16 tests, 3 snapshots, 0 analyses, 469 macros, 2 operations, 0 seed files, 91 sources, 0 exposures, 0 metrics
[2024-12-05, 14:39:08 UTC] {pod_manager.py:356} INFO - 14:39:08
[2024-12-05, 14:39:08 UTC] {pod_manager.py:356} INFO - 14:39:08 Encountered an error:
[2024-12-05, 14:39:08 UTC] {pod_manager.py:356} INFO - Database Error
[2024-12-05, 14:39:08 UTC] {pod_manager.py:356} INFO - [Errno 2] No such file or directory: '/etc/secrets/[GCP_PROJECT]/credentials.json'
[2024-12-05, 14:39:10 UTC] {pod_manager.py:424} ERROR - Error parsing timestamp (no timestamp in message ''). Will continue execution but won't update timestamp
[2024-12-05, 14:39:10 UTC] {pod_manager.py:356} INFO -
[2024-12-05, 14:39:10 UTC] {pod_manager.py:383} WARNING - Pod dbt-data-ext-omnitracking-morning-job-py-pre-build-tests-gbkd7d9v log read interrupted but container base still running
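One possible way to narrow this down: with method: oauth, dbt relies on Google Application Default Credentials, and those read a key file path from the GOOGLE_APPLICATION_CREDENTIALS environment variable when it is set. The profile above never mentions /etc/secrets/..., so the path in the error may come from that variable being set in the image or pod, or from a different profiles file baked into the image; both are assumptions, not something the log confirms. The sketch below is a small check that could be run inside the container to see what the variable points to. The function name describe_adc_env is hypothetical and uses only the Python standard library.

```python
import os
from pathlib import Path


def describe_adc_env(environ=None):
    """Report the state of GOOGLE_APPLICATION_CREDENTIALS.

    Returns a (status, path) tuple where status is one of
    "unset", "missing_file", or "file_present".
    """
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if environ is None else environ
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        # Nothing forces a key file; ADC would fall back to other sources.
        return ("unset", None)
    if not Path(path).is_file():
        # The variable points at a file that is not present in this container.
        return ("missing_file", path)
    return ("file_present", path)


if __name__ == "__main__":
    status, path = describe_adc_env()
    print(f"GOOGLE_APPLICATION_CREDENTIALS: {status} ({path})")
```

If this reports "missing_file", the variable is pointing at a secret that is not mounted in the pod. If it reports "unset", the path is probably coming from somewhere else, for example a keyfile entry in whichever profiles file dbt actually loads inside the image.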
Can someone help me, please?
Thanks in advance.