I have a pre-hook and a post-hook on a model. The pre-hook inserts audit information into an admin table, and the post-hook is intended to update that table when the model succeeds or fails. The post-hook works fine when the model succeeds, but if the model fails, the post-hook does not get executed.
The post-hook calls my update_status.sql macro, which does the update.
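For illustration, a macro along those lines might look something like this (a simplified sketch; the admin.model_audit table and its columns are assumptions rather than the original macro's actual names):

```sql
-- macros/update_status.sql -- simplified, hypothetical sketch of a post-hook status update.
-- Assumed audit table: admin.model_audit(model_name, invocation_id, start_time, end_time, status)
{% macro update_status() %}
    update admin.model_audit
       set end_time = current_timestamp,
           status   = 'success'
     where model_name    = '{{ this.name }}'
       and invocation_id = '{{ invocation_id }}'
{% endmacro %}
```

It would be attached to the model with `post_hook = "{{ update_status() }}"`, alongside a pre-hook that inserts the row with the start time. Because a post-hook only runs after the model's SQL succeeds, the update never fires for a failed model, which is exactly the behaviour described above.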
I cannot provide a complete working example, but I think the best approach would be to follow the DRY principle and parse the run results once with an on-run-end macro, rather than using a post-hook on every model.
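A minimal sketch of that shape, assuming a hypothetical admin.dbt_run_log table (in the on-run-end context, results is a list of dbt Result objects):

```sql
-- macros/log_run_results.sql -- illustrative sketch; the log table name and columns are assumptions
{% macro log_run_results(results) %}
    {% if execute %}
        {% for res in results %}
            {% set insert_sql %}
                insert into admin.dbt_run_log (invocation_id, model_name, status, execution_time_s)
                values (
                    '{{ invocation_id }}',
                    '{{ res.node.name }}',
                    '{{ res.status }}',
                    {{ res.execution_time or 0 }}  -- can be empty for skipped nodes
                )
            {% endset %}
            {% do run_query(insert_sql) %}
        {% endfor %}
    {% endif %}
{% endmacro %}
```

You would register it once in dbt_project.yml with `on-run-end: "{{ log_run_results(results) }}"`; it runs at the end of the invocation and sees every node's result, including failures, so there is no per-model post-hook to maintain.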
If you’re trying to track the results of your dbt invocations, you could also look at the dbt_artifacts package, which will deal with a lot of this for you.
Thanks for the reply! I tried implementing this with some additions, like adding compile start/end times and execute start/end times. Now I am exploring more about run_results and the other artifacts. One thing I would like to explore is tracking model runs as they happen: “on-run-end” uploads the results only after the whole run completes, so to track execution as it progresses, I have implemented pre-hooks that make an entry in a different table.
If the model succeeds, I update that table's entry with the end time; otherwise, the end time stays null. Using this, we can see whether a model has been running for longer than expected.
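As an example of what that check could look like, a query like this against the (hypothetical) audit table would surface models that have started but not recorded an end time:

```sql
-- Hypothetical monitoring query; datediff syntax here is Snowflake-style and varies by warehouse
select
    model_name,
    start_time,
    datediff('minute', start_time, current_timestamp) as minutes_running
from admin.model_audit
where end_time is null   -- still running, or failed before the post-hook could fire
order by start_time
```

One caveat, following from the original problem: because the post-hook never runs for a failed model, a null end_time cannot distinguish "still running" from "failed".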
I think you are mixing too many things together. You should separate the logic if you want a scalable and easily maintainable solution.
Running models is dbt's primary purpose, and it should focus on that. For monitoring, you should implement something that doesn't depend on the models themselves.
The dbt_artifacts package collects dbt run results and stores them for later analysis; you don't need to analyze dbt mid-run to find optimizations. If you want to limit how long models can run, depending on your adapter you can set a timeout parameter in profiles.yml to fail long-running queries after a specific number of seconds.
However, if you want to track dbt's progress, it depends on how you are running dbt. Either you check the CLI output, where you can see each model's status and how many models are left, or you run each dbt model as a separate task (for example, in Airflow) and check the status and duration of each task.
Checking the CLI output is a manual task, but with Airflow you could keep the monitoring and dbt processes separate.
I am not very familiar with dbt Cloud, but maybe @joellabes could explain whether dbt Cloud has some kind of solution for this.
I agree that you should use the artifacts package to understand your models’ behaviour rather than tracking it in real time. Post-hooks only run for successful models, whereas on-run-end hooks always run. Additionally, all this secondary logging can drastically slow down your project, making your timing endeavours somewhat counterproductive!