Stop jumping between Airflow, dbt Cloud, and Snowflake to debug a failed run. I built something to fix this.
Every data engineer knows the drill.
A DAG fails at 2 AM. You open Airflow — the logs say 404. You switch to dbt Cloud — the run looks fine. You query Snowflake’s QUERY_HISTORY — 11 queries, some erroring. You tab between three tools for 45 minutes and still aren’t sure which thing actually broke the pipeline.
I’ve been there too many times. So I built DataLens — a unified observability layer that pulls Airflow, dbt, and Snowflake into a single correlated timeline for every run.
Here’s what a failed datalens_lab_source_freshness run looks like in DataLens:
In one view you get:
- Every event in order: Airflow task logs, dbt test results, and Snowflake queries, merged and sorted by timestamp
- Failed events expanded automatically, so the error is front and center, not buried
- Root cause analysis: deterministic pattern matching tells you why it failed. In this run, Airflow couldn’t reach the log server (worker 404), and DataLens caught it instantly and told me exactly what to fix
- Lineage context: the right sidebar shows the 5 dbt nodes involved (fct_orders, stg_customers, stg_orders, customers, orders), so you can see what’s upstream of the failure without opening dbt docs
- Credit cost per run: 0.000160 Snowflake credits, broken down per query
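To make the timeline and root cause ideas concrete, here is a minimal sketch, not DataLens’s actual code. The `Event` shape, the two rules in `RULES`, and the sample messages are all made up for illustration; real Airflow, dbt, and Snowflake payloads look different.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import chain


@dataclass(frozen=True)
class Event:
    ts: datetime
    source: str          # "airflow", "dbt", or "snowflake"
    message: str
    failed: bool = False


# Hypothetical rules: (pattern name, regex, suggested fix). Order matters:
# the first rule that matches wins, which keeps the result deterministic.
RULES = [
    ("log_server_unreachable", re.compile(r"\b404\b"),
     "Check that the Airflow worker log server is reachable."),
    ("source_freshness_failure", re.compile(r"freshness", re.IGNORECASE),
     "Check whether the upstream loader ran on schedule."),
]


def merge_timeline(*streams):
    """Combine per-tool event lists into one list sorted by timestamp."""
    return sorted(chain(*streams), key=lambda e: e.ts)


def diagnose(event):
    """Return (pattern name, fix) for a failed event, or None if nothing matches."""
    if not event.failed:
        return None
    for name, pattern, fix in RULES:
        if pattern.search(event.message):
            return name, fix
    return None


def at(minute):
    return datetime(2024, 1, 1, 2, minute, tzinfo=timezone.utc)


# Made-up sample events, one list per tool.
airflow = [
    Event(at(0), "airflow", "task started"),
    Event(at(5), "airflow", "log fetch from worker returned 404", failed=True),
]
dbt = [Event(at(2), "dbt", "source freshness check: error", failed=True)]
snowflake = [Event(at(1), "snowflake", "SELECT MAX(loaded_at) FROM raw.orders")]

timeline = merge_timeline(airflow, dbt, snowflake)
for event in timeline:
    print(event.ts.isoformat(), event.source, event.message, diagnose(event))
```

Because the rules are plain regexes checked in a fixed order, the same failed event always gets the same explanation, which is what “deterministic” means here.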
No more context switching. One URL, shareable with your whole team.
What’s under the hood:
- Connects to your existing Airflow REST API, dbt Cloud, and Snowflake, with no agents and no sidecars
- Extracts invocation_id from Airflow logs to correlate dbt events to the triggering DAG run
- Falls back to time-window correlation when logs aren’t available (yes, including when Airflow returns 404)
- 6 deterministic root cause patterns: warehouse unavailable, schema change, dbt test failure upstream, source freshness failure, incremental drift, empty upstream table
- All scoped per project, so no cross-tenant data leakage
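The invocation_id correlation and its time-window fallback can be sketched roughly like this. It is an illustration under assumptions, not the real implementation: the regex assumes the id shows up in the task log as a UUID right after the text "invocation_id", and the dict keys (`start`, `end`, `started_at`) and the 2-minute `slack` are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: the log contains something like invocation_id=<uuid> or
# "invocation_id": "<uuid>". The real shape depends on how dbt is invoked.
INVOCATION_ID_RE = re.compile(
    r"invocation_id\W+([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})",
    re.IGNORECASE,
)


def extract_invocation_id(log_text):
    """Return the first invocation_id found in the log text, or None."""
    if not log_text:
        return None
    match = INVOCATION_ID_RE.search(log_text)
    return match.group(1).lower() if match else None


def correlate(dag_run, dbt_runs, log_text=None, slack=timedelta(minutes=2)):
    """Pick the dbt runs that belong to one DAG run.

    Returns (matches, method), where method is "invocation_id" or "time_window".
    """
    invocation_id = extract_invocation_id(log_text)
    if invocation_id is not None:
        matches = [r for r in dbt_runs if r["invocation_id"] == invocation_id]
        if matches:
            return matches, "invocation_id"

    # Fallback: no usable logs (for example a 404 from the log server), so keep
    # the dbt runs that started inside the DAG run's window, plus some slack.
    start = dag_run["start"] - slack
    end = dag_run["end"] + slack
    matches = [r for r in dbt_runs if start <= r["started_at"] <= end]
    return matches, "time_window"


# Made-up sample data.
utc = timezone.utc
dag_run = {
    "start": datetime(2024, 1, 1, 2, 0, tzinfo=utc),
    "end": datetime(2024, 1, 1, 2, 10, tzinfo=utc),
}
dbt_runs = [
    {"invocation_id": "1b4e28ba-2fa1-11d2-883f-0016d3cca427",
     "started_at": datetime(2024, 1, 1, 2, 1, tzinfo=utc)},
    {"invocation_id": "9c858901-8a57-4791-81fe-4c455b099bc9",
     "started_at": datetime(2024, 1, 1, 5, 0, tzinfo=utc)},
]

by_id = correlate(
    dag_run, dbt_runs,
    log_text="dbt finished, invocation_id=1b4e28ba-2fa1-11d2-883f-0016d3cca427",
)
by_time = correlate(dag_run, dbt_runs, log_text=None)
```

Returning the method alongside the matches lets a UI mark time-window matches as less certain than exact invocation_id matches.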
I’m actively building this and want your input.
A few specific questions for the dbt community:
- What’s your current debugging workflow when a dbt + Airflow run fails? How many tools do you touch?
- What’s missing from the timeline view? Date range filter? dbt model drill-down? Cost-per-model breakdown?
- Would source freshness failures and dbt test failures deserve their own dedicated views (separate from the main run timeline)?
- What format would you want root cause explanations in? Short one-liner? Structured evidence table? Copy-paste fix command?
- Is per-run Snowflake cost visibility useful to you, or is that a finance/analytics team concern?
Drop a reply or DM me — every piece of feedback directly shapes what gets built next. If you’re running Airflow + dbt + Snowflake and want early access to test this against your own pipelines, let me know.
LinkedIn: https://www.linkedin.com/in/harikrishnade/
Email: gharikrishnade@gmail.com
DataLens is built with FastAPI, React, PostgreSQL, and Celery. It connects to your existing stack — nothing to rip and replace.
