The problem I’m having
We’re trying to write unit tests for a Snowflake-based dbt model that uses ARRAY_CONTAINS() to check for ancillaries. The model works in production, but unit tests fail with type mismatch errors when trying to mock array columns.
Error:
Database Error
001044 (42P13): SQL compilation error: error line 164 at position 13
Invalid argument types for function 'ARRAY_CONTAINS': (VARCHAR(14), ARRAY)
The context of why I’m trying to do this
Model: int_order_details.sql
{{ config(materialized="ephemeral") }}
select
order_id,
flight_route,
added_ancillaries, -- ARRAY column
carrier_codes, -- ARRAY column
case
when array_contains('Insurance', added_ancillaries) then 1 else 0
end as has_insurance,
case
when year(created_date) >= 2021
then coalesce(service_fee_raw, 0)
else 2.30 -- Legacy fee
end as service_fee
from {{ ref("base_orders") }}
What I’ve already tried
Approach 1. Inline SQL with array_construct() in YAML
I used array_construct() directly as below, also tried with array_construct(“”)::array, to_array(parse_json(‘[]’)). But got the error above.
unit_tests:
- name: test_legacy_fee
model: int_order_details
given:
- input: ref('base_orders')
format: sql
rows: |
select 'ORDER_001' as order_id,
'JFK-LAX' as flight_route,
array_construct('Insurance', 'FastTrack') as added_ancillaries,
array_construct('AA') as carrier_codes,
'2020-12-31'::date as created_date,
5.00 as service_fee_raw
expect:
rows:
- { order_id: "ORDER_001", service_fee: 2.30, has_insurance: 1 }
Approach 2. YAML Row Format with Array Literals
I followed this approach based on recommendation on a similar issue on dbt unit test. Got the same error as above.
given:
- input: ref('base_orders')
rows:
- order_id: 'ORDER_001'
flight_route: 'JFK-LAX'
added_ancillaries: ['Insurance', 'FastTrack']
carrier_codes: ['AA']
created_date: '2020-12-31'
service_fee_raw: 5.00
Approach 3: Ephemeral Helper Model
I created a separate model with mock data and called this model in the unit test
ut_base_orders_mock_data.sql (mock data model )
{{ config(materialized="ephemeral", tags=["unit_test_helper"]) }}
with mock_data as (
select
'ORDER_001' as order_id,
'JFK-LAX' as flight_route,
array_construct('Insurance', 'FastTrack') as added_ancillaries,
array_construct('AA') as carrier_codes,
'2020-12-31'::date as created_date,
5.00 as service_fee_raw
)
select * from mock_data
unit test file
model: int_order_details
given:
- input: ref('base_orders')
format: sql
rows: |
select * from ut_base_orders_mock_data
where order_id = 'ORDER_001'
Current Hypothesis
In all approaches, when the unit test framework materializes the mock data, array_construct() or YAML array literals (['item']) appear to be treated as VARCHAR instead of ARRAY type when passed to the model’s SQL logic.
The ARRAY_CONTAINS() function in the model then receives:
-
First argument: String literal
'Insurance' -
Second argument: VARCHAR (should be ARRAY)
Questions
-
Is there a way to explicitly cast array columns in unit test mock data?
-
Does dbt’s unit test framework support complex Snowflake types like ARRAY/OBJECT?
-
Should we be using a different testing pattern for models with array operations?
Will really appreciate your help. Thanks so much.
