inserting metadata after every model

The problem I’m having

I created a macro as following:

-- Description: This macro counts the number of rows in a table and inserts the count into a metadata table
--use this macro at the end of every model you want to be included in the metadata table (using post-hook)

{% macro insert_metadata() %}
    {{log("Starting ", info=true)}}
    --query definition: count the number of rows in the table
    {% set count_query %}
        {{log("Executing count_query macro", info=true)}}
        SELECT COUNT(*) AS count FROM {{ this }}
    {% endset %}

    --excecute the queries and then
    --insert the count into the metadata table
    {% if execute %}
        {{log("Executing insert_metadata macro", info=true)}}
        {% set count_result = run_query(count_query) %}
        {{log(count_result, info=true)}}
        --if we got results from the count query, insert the count into the metadata table
        {% if count_result|length > 0 %}
            {% set count_result_value = count_result.columns[0].values()[0] %}
            {% set table_name =  get_table_name(this) %} --get the name of the table without schema and database
            {{log(count_result_value, info=true)}}
            {{log(table_name, info=true)}}
            --query definition: insert the metadata into the metadata table
            {% set insert_query %}
                INSERT INTO DD_DATASETS.METADATA (RECORDS_COUNT, HAPPENED_AT, DESCRIPTION)
                VALUES
                (
                    {{count_result_value}} , current_timestamp(), {{table_name}}
                )
            {% endset %}
            {{log("now we run", info=true)}}
            {% set insert_result = run_query(insert_query) %}
            {{log(insert_result, info=true)}}

        {% endif %}
    {% endif %}
{% endmacro %}

and I’m testing it in this tiny model:

{{
    config(
       post_hook="{{ insert_metadata() }}"
    )
}}
with cte1 as (select 1 as column1)
select * from cte1

The context of why I’m trying to do this

Hello,
We have a long pipeline of moving data between components in Kubeflow and some of the components are dbt component. We want to have some metadata about these tables that are being created in dbt, which is basically the rows count, timestamp and table name.
and all of these will be in one table called MetaData.
So my idea is to create a macro that count the rows of the table that was created through the model and insert the result to a table.
I’m thinking I will call this macro as post_hook in every model I want to have metadata about

What I’ve already tried

for now it seems to be working and inserting the data but I keep having an error always at the end of the dbt run (even though I see the table metadata is updated as the picture)

The error message:

16:36:08  Snowflake adapter: [WARNING]: Explicit transactional logic should be used only to wrap DML
logic (MERGE, DELETE, UPDATE, etc). The keywords BEGIN; and COMMIT; should be
placed directly before and after your DML statement, rather than in separate
statement calls or run_query() macros.
16:36:08  Unhandled error while executing target/run/xxxxx/models/PREP/testing_ruba.sql
cannot access local variable 'connection' where it is not associated with a value
16:36:09  1 of 1 ERROR creating sql table model DD_PREP.testing_ruba ............... [ERROR in 2.87s]
16:36:09  
16:36:09  Finished running 1 table model, 1 hook in 0 hours 0 minutes and 4.70 seconds (4.70s).
16:36:09  
16:36:09  Completed with 1 error and 0 warnings:
16:36:09  
16:36:09    cannot access local variable 'connection' where it is not associated with a value
16:36:09  
16:36:09  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

so what is this error and what is “connection” variable?
Any better idea on how to do what I want?
I would appreciate any help!

Thanks in advance :slight_smile:

another question…

how can I make sure the metadata table is always created in case it was dropped for some reason?
I don’t know how to do it now since I’m not creating it as dbt model, but only inserting to it in the macro

Hi @Ruba_AT,

Please, take a look at this docs, and then look at the last lines of this piece of your code:

-- Description: This macro counts the number of rows in a table and inserts the count into a metadata table
--use this macro at the end of every model you want to be included in the metadata table (using post-hook)

{% macro insert_metadata() %}
    {{log("Starting ", info=true)}}
    --query definition: count the number of rows in the table
    {% set count_query %}
        {{log("Executing count_query macro", info=true)}}
        SELECT COUNT(*) AS count FROM {{ this }}
    {% endset %}

    --excecute the queries and then
    --insert the count into the metadata table
    {% if execute %}
        {{log("Executing insert_metadata macro", info=true)}}
        {% set count_result = run_query(count_query) %}
    ...

You’re evaluating the execute variable before running the query. Is that ok or was a transcription issue?

Hi @hvignolo

As I understood “execute” is about if dbt is really running or it’s only in parsing phase still… so my code is saying if really dbt running, then run the query… is that correct? or am I missing something?

1 Like

I found the problem is due to having comments inside the macro definition… it seems the comments in this way

-- comment

is not seen as a comment in Jinja, which was giving the error.
Once I removed the comments, the error was gone.
Very weird and frustrating because the error was not even related to the problem or giving any hint…
anyway I wrote this here, in case someone else faces the same problem.

2 Likes

You’re right! Sorry I overlooked the comments on my first look :sweat_smile:

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.