Macro running automatically after a test

Hi guys,

I’m a Data Analyst intern at a jewelry company, and I use dbt with BigQuery as part of my work.

For several days I have been stuck on the same problem so I will gladly accept help from more experienced people.

In BigQuery I have a table containing fields associated with the data of the products we sell. For example, the article reference, the weight of the product, the country of origin etc.

Each field has specific business rules, and I am in charge of automating these business rules.

For this, I use dbt tests. I have 14 in total, but I’m only going to share 2 to explain the problem I’m having.

In the tests folder of my dbt project I have the test_article_family.sql file:

WITH article_family_check AS (
    SELECT
        a.article AS article_number,
        b.article_family AS article_family,
        CASE
            WHEN article_family IS NULL THEN 'Article Family must not be empty'
            ELSE NULL
        END AS business_rule
    FROM bigqueryproject.projectdataset.table AS a
    JOIN {{ ref('tbsd_mara') }} AS b
        ON a.article = b.article
)

SELECT
    article_number,
    article_family,
    CASE
        WHEN business_rule IS NOT NULL THEN 'FAIL'
        ELSE 'PASS'
    END AS test_article_family, business_rule
FROM article_family_check

As well as the test_article_number.sql file:

WITH article_number_check AS (
    SELECT article AS article_number,
        CASE
            WHEN LENGTH(article) != 10 AND NOT STARTS_WITH(article, 'VC')
            THEN 'Article Number must be 10 characters long and start with VC'
            WHEN LENGTH(article) != 10 THEN 'Article Number must be 10 characters long'
            WHEN NOT STARTS_WITH(article, 'VC') THEN 'Article Number must start with VC'
            ELSE NULL
        END AS business_rule
    FROM bigqueryproject.projectdataset.table
)

SELECT article_number,
    CASE
        WHEN business_rule IS NOT NULL THEN 'FAIL'
        ELSE 'PASS'
    END AS test_article_number, business_rule
FROM article_number_check

I made a macro (tests_results_logs.sql) that runs automatically when a test has been executed, and which creates a “dbt_tests_logs” table in my BigQuery project containing 2 columns:

  • test_name retrieved via the test output (result.node.name): the column therefore contains either test_article_family or test_article_number when I issue the “dbt test” command.

  • computed_on_timestamp which contains the timestamp of the test execution.

Here it is :

{% macro tests_results_logs(results) %}

  {%- set test_results = [] -%}
    {%- for result in results -%}
      {%- if result.node.resource_type == 'test' and result.status != 'skipped' -%}
        {%- do test_results.append(result) -%}
      {%- endif -%}
    {%- endfor -%}

  {%- set tests_results_logs_table = target.schema ~ '.dbt_tests_logs' -%}

  IF NOT EXISTS (SELECT 1 FROM projectdataset.INFORMATION_SCHEMA.TABLES WHERE table_name = 'dbt_tests_logs')
  THEN
    CREATE TABLE {{ tests_results_logs_table }} (
      article_number STRING,
      test_name STRING,
      computed_on_timestamp TIMESTAMP
    );
  END IF;

  DELETE FROM {{ tests_results_logs_table }} WHERE 1=1;

  {%- for result in test_results -%}
    INSERT INTO {{ tests_results_logs_table }} VALUES (
      -- ...
      "{{ result.node.name }}",
      CURRENT_TIMESTAMP()
    );
  {%- endfor -%}
{% endmacro %}

I would like to add a third column “article_number” which contains the same thing as that of the test_article_family.sql and test_article_number.sql files.

As the test_name column can contain two values, each item code will be present in duplicate. If the article_number column contains “VCARKT4349”, then there will be two rows where VCARKT4349 will have test_article_family as the value of test_name, and another row where VCARKT4349 will have test_article_number as the value of test_name.

How do I modify my macro to be able to do this?

Thank you in advance for your help.