I have created a macro ‘chm_prod_filter’ that filters out row where a column of the form ‘str1-str2-str3’ does not have the value ‘prod-…’ as str1-component
{% macro chm_prod_filter(appid_column) %}
case when split_part({{ appid_column }}, '-', 1) = 'prod' then true
else false
end
{% endmacro %}
Now I wanted to implement a unit test to ensure that this macro works as intended. For this, I created an ephemeral model ‘prod_data_test_data’ to create test data:
{{
config(
materialized='ephemeral'
)
}}
with test_values as (
select column1 as app_id
, column2 as row_n
from values ('prod-nlz-app', 1)
, ('dev-nlz-app', 2)
, ('stage-nlz-app', 3)
, ('prod-aaz-web', 4)
, ('dev-aaz-web', 5)
, ('stage-aaz-web', 6)
)
select app_id
, row_n
from test_values
Then I created a test script for the macros ‘validate__prod_filter’ like the following:
{{
config(
materialized="table",
tags=['chm_custom']
)
}}
with test_data as (
select *
from {{ ref('prod_filter_test_data') }}
)
select app_id
, row_n
from test_data
where {{ chm_prod_filter('app_id') }}
This model runs without error and returns the expected results
| app_id | row_n|
| ------------ | ---- |
| prod-nlz-app | 1 |
| prod_aaz_web | 4 |
Now, I tried to implement a unit test, to test whether or not the macro works as expected:
version: 2
models:
- name: validate__prod_filter
description: Testing that prod filtering works correctly
unit_tests:
- name: prod_filter_test
description: Tests that the prod filter is working
model: validate__prod_filter
given:
- input: ref('prod_filter_test_data')
format: sql
rows: |
select 1 as app_id, 2 as row_n
expect:
rows:
- {app_id: prod-nlz-app, row_n: 1}
- {app_id: prod-aaz-web, row_n: 4}
but this fails with
actual differs from expected:
@@ ,APP_ID,ROW_N
---,null ,1
---,null ,4
It seems to me, that the values in app_id are not correctly used, resulting in null-values, but I do not understand why.
I tried different alternatives like selecting columns directly
select 'app_id' as app_id, 'row_n' as row_n
and also
select 1 as app_id, 2 as row_n, 'app_id' as app_id
as I thought dbt may need to know the jinja variable in the macro specifically, but nothing worked.
Many thanks for any help.
Update 1
I now understand, why it does not work thanks to debug mode. The actual do not use the data actually in the ephemeral model and are, thus, empty and in both the actual and expected, dbt tries to cast app_id as numeric. I am pretty sure it has to do with how I use the select statement…
-- Build actual result given inputs
with dbt_internal_unit_test_actual as (
select app_id
, row_n
, 'actual' as "actual_or_expected"
from ( with __dbt__cte__prod_filter_test_data as (
select 1 as app_id
, 2 as row_n
),
test_data as (
select *
from __dbt__cte__prod_filter_test_data
)
select app_id
, row_n
from test_data
where case when split_part(app_id, '-', 1) = 'prod' then true
else false
end
) _dbt_internal_unit_test_actual
)
-- Build expected result
dbt_internal_unit_test_expected as (
select app_id
, row_n
, 'expected' as "actual_or_expected"
from ( select try_cast('prod-nlz-app' as NUMBER(1,0)) as app_id
, try_cast('1' as NUMBER(1,0)) as row_n
union all
select try_cast('prod-aaz-web' as NUMBER(1,0)) as app_id
, try_cast('4' as NUMBER(1,0)) as row_n
) _dbt_internal_unit_test_expected
)
-- Union actual and expected results
select * from dbt_internal_unit_test_actual
union all
select * from dbt_internal_unit_test_expected