The problem I’m having
Unable to execute macro which brings in column names of table from INFORMATION_SCHEMA.TABLES and use it as value for check_cols in the snapshot model configuration.
The context of why I’m trying to do this
As part of my data model, I need to first create snapshots of source tables and use them as basis for rest of processing. However, I need to exclude some of the source columns from snapshot . I need to do this for all the source tables (10+ tables). Hence, I created a macro that can only get columns I require. In event of changes at source i.e. addition of more columns, I don’t want to come back again and revise the logic and let the macro handle it.
What I’ve already tried
I can get this working by putting in hard coded column values. To get the column values dynamically, I have created macro which is causing failure.
Some example code or error messages
Macro I created - get_column_names_excl_meta_cols
{%- macro get_column_names_excl_meta_cols(schema_name, table_name) -%}
{% set table_query %}
select
string_agg("'" || column_name || "'")
from
{%- if target.name == 'prod' -%}
gcp-project-prod.{{ schema_name }}.INFORMATION_SCHEMA.COLUMNS
{%- elif target.name == 'test' -%}
gcp-project-test.{{ schema_name }}.INFORMATION_SCHEMA.COLUMNS
{%- else -%}
gcp-project-dev.{{ schema_name }}.INFORMATION_SCHEMA.COLUMNS
{%- endif -%}
where table_name = '{{ table_name }}'
and column_name not in ('meta_ingest_ts')
{% endset %}
{%- set result = run_query(table_query).columns[0].values()[0] -%}
{{return( result )}}
{%- endmacro -%}
my snapshot model using the above macro
{% snapshot store_hist %}
{{
config(
target_schema=generate_schema_name('stores'),
strategy='check',
invalidate_hard_deletes=True,
unique_key='store_no',
check_cols=get_column_names_excl_meta_cols('stores', 'store_master')
)
}}
with base as (
select
*
from {{ src_table('stores', 'store_master') }}
)
select *
from base
{% endsnapshot %}
Command I am running - dbt snapshot --select store_hist
.
Error
Compilation Error in snapshot store_hist (snapshots\store_hist.sql)
‘None’ has no attribute ‘table’
Alternatives explored
I have encapsulated if execute command in my macro get_column_names_excl_meta_cols
as follows
{% if execute %}
{%- set result = run_query(table_query).columns[0].values()[0] -%}
{{return( result )}}
{% else %}
{{return( false ) }}
{% endif %}
Putting that in when I run the dbt snapshot --select store_hist
command I am getting the following error. I don’t fully understand how this command works. Reading docs, I realize (or am assuming) that this gets utilized when dbt run command works and using that prevents parsing error but I don’t know how it works in the context of dbt snapshot
command. Hence, I removed that if execute
condition.
Can someone please guide me on this? I am using dbt-core 1.5, dbt-bigquery 1.5 version and using dbt cli.
Please let me know any additional details required. I have also replaced string_agg function with array_agg (as that retrieves the value as array) but to no avail.
Parsing Error at path [‘check_cols’]:
False is not valid under any of the given schemas