The problem I’m having
I created a macro as following:
-- Description: This macro counts the number of rows in a table and inserts the count into a metadata table
--use this macro at the end of every model you want to be included in the metadata table (using post-hook)
{% macro insert_metadata() %}
{{log("Starting ", info=true)}}
--query definition: count the number of rows in the table
{% set count_query %}
{{log("Executing count_query macro", info=true)}}
SELECT COUNT(*) AS count FROM {{ this }}
{% endset %}
--excecute the queries and then
--insert the count into the metadata table
{% if execute %}
{{log("Executing insert_metadata macro", info=true)}}
{% set count_result = run_query(count_query) %}
{{log(count_result, info=true)}}
--if we got results from the count query, insert the count into the metadata table
{% if count_result|length > 0 %}
{% set count_result_value = count_result.columns[0].values()[0] %}
{% set table_name = get_table_name(this) %} --get the name of the table without schema and database
{{log(count_result_value, info=true)}}
{{log(table_name, info=true)}}
--query definition: insert the metadata into the metadata table
{% set insert_query %}
INSERT INTO DD_DATASETS.METADATA (RECORDS_COUNT, HAPPENED_AT, DESCRIPTION)
VALUES
(
{{count_result_value}} , current_timestamp(), {{table_name}}
)
{% endset %}
{{log("now we run", info=true)}}
{% set insert_result = run_query(insert_query) %}
{{log(insert_result, info=true)}}
{% endif %}
{% endif %}
{% endmacro %}
and I’m testing it in this tiny model:
{{
config(
post_hook="{{ insert_metadata() }}"
)
}}
with cte1 as (select 1 as column1)
select * from cte1
The context of why I’m trying to do this
Hello,
We have a long pipeline of moving data between components in Kubeflow and some of the components are dbt component. We want to have some metadata about these tables that are being created in dbt, which is basically the rows count, timestamp and table name.
and all of these will be in one table called MetaData.
So my idea is to create a macro that count the rows of the table that was created through the model and insert the result to a table.
I’m thinking I will call this macro as post_hook in every model I want to have metadata about
What I’ve already tried
for now it seems to be working and inserting the data but I keep having an error always at the end of the dbt run (even though I see the table metadata is updated as the picture)
The error message:
16:36:08 Snowflake adapter: [WARNING]: Explicit transactional logic should be used only to wrap DML
logic (MERGE, DELETE, UPDATE, etc). The keywords BEGIN; and COMMIT; should be
placed directly before and after your DML statement, rather than in separate
statement calls or run_query() macros.
16:36:08 Unhandled error while executing target/run/xxxxx/models/PREP/testing_ruba.sql
cannot access local variable 'connection' where it is not associated with a value
16:36:09 1 of 1 ERROR creating sql table model DD_PREP.testing_ruba ............... [ERROR in 2.87s]
16:36:09
16:36:09 Finished running 1 table model, 1 hook in 0 hours 0 minutes and 4.70 seconds (4.70s).
16:36:09
16:36:09 Completed with 1 error and 0 warnings:
16:36:09
16:36:09 cannot access local variable 'connection' where it is not associated with a value
16:36:09
16:36:09 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
so what is this error and what is “connection” variable?
Any better idea on how to do what I want?
I would appreciate any help!
Thanks in advance
another question…
how can I make sure the metadata table is always created in case it was dropped for some reason?
I don’t know how to do it now since I’m not creating it as dbt model, but only inserting to it in the macro