I was searching for a way to prevent users from accidentally running the whole project. For example, in a very large project, if someone accidentally runs a bare dbt run, dbt will try to run thousands of models, which can be very costly.
Of course, you can interrupt it with ctrl-c, but I wanted something safer (the user might just run the command and go drink a coffee), so I created this very simple macro, which can be used with the on-run-start hook.
macros/check_select_arg.sql
{% macro check_select_arg() %}
    {% if not invocation_args_dict.select and target.name != 'prod' %}
        {{ exceptions.raise_compiler_error("Error: You must provide at least one select argument") }}
    {% endif %}
{% endmacro %}
dbt_project.yml
on-run-start:
  - "{{ check_select_arg() }}"
It checks whether the user invoked the command with the select argument, and if they didn't, an error is raised.
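If the check does not seem to fire, it can help to print what dbt actually passes to the macro. The following is only a debugging sketch: the macro name debug_invocation_args and its file name are made up for illustration, and the exact contents of invocation_args_dict may differ between dbt versions.

```jinja
{# macros/debug_invocation_args.sql (hypothetical helper, not part of the original macro) #}
{% macro debug_invocation_args() %}
    {# info=True prints the message to the console as well as to the log file #}
    {{ log("command: " ~ invocation_args_dict.get('which'), info=True) }}
    {{ log("select: " ~ invocation_args_dict.get('select'), info=True) }}
    {{ log("target: " ~ target.name, info=True) }}
{% endmacro %}
```

It could be added as an extra entry under on-run-start while investigating, and removed afterwards.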
You can allow specific targets to run commands without this restriction, for example to make production jobs easier to configure.
You can customize this macro in whatever way makes the most sense to you.
I don't know why it's not triggered at all, not even the log message:
❯ dbt run
21:00:50 Running with dbt=1.4.5
21:00:50 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 2 unused configuration paths:
- models.yyy
- models.xxx
21:00:50 Found 161 models, 132 tests, 3 snapshots, 0 analyses, 560 macros, 0 operations,
18 seed files, 73 sources, 9 exposures, 0 metrics
21:00:51
21:00:52 Concurrency: 4 threads (target='dev')
21:00:52
21:00:52 1 of 161 START sql table model [Cant add more info, enterprise project]
And I've renamed the macro to add an "s", both in the file name (which makes no difference) and in the macro name (required).
I'll sleep on it, and maybe tomorrow I'll have the answer.
Funny thing: I just upgraded to the latest 1.5 (1.5.4) and … it works. Meh… I don't like it when I don't understand why.
However, the latest 1.6 produces an error, so I can't upgrade at the moment: "cannot import name 'POLLING_PREDICATE' from 'google.api_core.future.polling'"
But this is another issue for another post! Thanks for your help Bruno, and honored to meet you (even if it's virtual): I'm a happy reader of your tips on LinkedIn.
This was very helpful. We also noticed that dbt compile runs the on-run-start hook too. Usually compile is a cheap operation that can be used to test massive refactors, and this check would block a project-wide compile.
If you want the check to skip compile and apply only to dbt run and dbt build, you can do this:
{% macro check_select_args() %}
    {% if not invocation_args_dict.select and (target.name != 'prod') and (invocation_args_dict.which in ['run', 'build']) %}
        {{ exceptions.raise_compiler_error("run/build should have select argument. ") }}
    {% endif %}
{% endmacro %}
Awesome addition to the macro, @adiamaan92! Loved that.
This way the admin can choose which commands follow this rule. I would add test, source and snapshot, and we can also use a list for the target names:
{% macro check_select_arg() %}
    {% if
        not invocation_args_dict.select
        and target.name not in ['prod']
        and invocation_args_dict.which in ['build', 'run', 'test', 'source', 'snapshot']
    %}
        {{ exceptions.raise_compiler_error("Error: You must provide at least one select argument") }}
    {% endif %}
{% endmacro %}
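One way to let the admin change these lists without editing the macro might be to read them from project variables. This is just a sketch: the variable names unrestricted_targets and restricted_commands are hypothetical, and the defaults simply mirror the lists in the macro above.

```jinja
{% macro check_select_arg() %}
    {# Both var names are made up for this example; the defaults are used when they are not set #}
    {% set unrestricted_targets = var('unrestricted_targets', ['prod']) %}
    {% set restricted_commands = var('restricted_commands', ['build', 'run', 'test', 'source', 'snapshot']) %}
    {% if
        not invocation_args_dict.select
        and target.name not in unrestricted_targets
        and invocation_args_dict.which in restricted_commands
    %}
        {{ exceptions.raise_compiler_error("Error: You must provide at least one select argument") }}
    {% endif %}
{% endmacro %}
```

The two variables could then be set under vars: in dbt_project.yml, so the rule is configured in one place.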
I'm using a variation of your macro with BigQuery, and interrupting does not stop the adapter from running the pipeline. Is there another exception I can raise, or an alternative action I can take, to cancel all jobs?
There are related discussions about getting the BQ adapter and dbt-core to cancel all queries here. Thank you!
UPDATE/EDIT: I'm getting strange behavior from this macro. The first time I ran it, it raised the exception, but it seems the jobs had already been sent to BQ and they completed. The second and subsequent times, a compilation error was raised and nothing was executed. The only difference was that I lowered the max threads count in my ~/.dbt/profiles.yml to be below the number of models that would have been run on the second and subsequent runs.
Second update: when trying to generate a model from a source file in VSCode automatically using dbt Power User, the macro actually prevents the model from being generated.
It's interesting, because on my side I'm running dbt Core (version 1.7.13) and that macro only works when the target folder is empty. As soon as I run another dbt command to run a specific model and then try to execute dbt run again, it no longer gives the compilation error.
This will prevent most things from building, with the exception of seeds and models that reference those seeds, but you can achieve the same thing by creating your own ref() macro in your project and editing it, which then completely blocks a dbt run or dbt build without selectors.
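A rough sketch of what such a ref() override could look like is below. It is untested and rests on several assumptions: dbt 1.5 or later (where ref() accepts a version argument), the file name macros/ref.sql, and the same condition as the hook macro earlier in this thread. The execute guard is there so that the error is only raised when models are actually being compiled to run, not while the project is being parsed.

```jinja
{# macros/ref.sql: a sketch only, not verified against every dbt version #}
{% macro ref() %}
    {# Raise only at execution time, and only for bare run/build outside prod #}
    {% if
        execute
        and not invocation_args_dict.select
        and target.name != 'prod'
        and invocation_args_dict.which in ['run', 'build']
    %}
        {{ exceptions.raise_compiler_error("run/build should have select argument.") }}
    {% endif %}

    {# Otherwise hand over to dbt's built-in ref(), passing the arguments through #}
    {% set version = kwargs.get('version') or kwargs.get('v') %}
    {% if (varargs | length) == 1 %}
        {% do return(builtins.ref(varargs[0], version=version)) %}
    {% else %}
        {% do return(builtins.ref(varargs[0], varargs[1], version=version)) %}
    {% endif %}
{% endmacro %}
```

Because the check lives inside ref(), a model that never calls ref() would not be blocked by it.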
Hi, I am trying to do the same thing in dbt Cloud, but my macro is getting an undefined error.
Command failed
Compilation Error in operation udp_gscda_distribution-on-run-start-0 (./dbt_project.yml)
'check_args' is undefined. This can happen when calling a macro that does not exist. Check for typos and/or install package dependencies with "dbt deps".
Any thoughts?