The problem I’m having
We have encountered discrepancies between Spark and BigQuery when using dbt unit tests. Specifically, the issues arise when handling input fixture data for columns of type `ARRAY<STRING>`. Below is a more detailed explanation with examples to illustrate the problems.
The context of why I’m trying to do this
We are working on a dbt project migration from Spark to BigQuery. To ensure that our models behave consistently across both platforms, we rely on dbt unit tests with mock input data. However, discrepancies in how Spark and BigQuery handle `ARRAY<STRING>` columns in input fixtures are making it difficult to write and execute these tests effectively.
- In Spark:
  - If an input column of type `ARRAY<STRING>` is unused in the test, it can be omitted without causing any issues.
  - When a column is omitted, Spark handles it gracefully by interpreting it as `CAST(NULL AS ARRAY<STRING>)`.
- In BigQuery:
  - If a column is not mocked, BigQuery attempts to generate a `SAFE_CAST(NULL AS ARRAY)` statement, which results in a syntax error because BigQuery does not allow casting `NULL` to an array without specifying the element type.
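For illustration, here is a minimal sketch of the difference (the column names are hypothetical, not the SQL dbt literally emits):

```sql
-- Spark: an omitted ARRAY<STRING> column is rendered as a typed NULL, which parses and runs
select 1 as id, cast(null as array<string>) as tags;

-- BigQuery: the generated cast omits the element type, so the statement fails to parse
select 1 as id, safe_cast(null as array) as tags;  -- syntax error: ARRAY requires <element_type>
```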
Root Cause
The differences in behavior appear to stem from how Spark and BigQuery handle `NULL` values and array data during compilation and execution:
- Spark allows implicit casting of `NULL` to `ARRAY<STRING>` without requiring explicit type information.
- BigQuery requires explicit type information when casting `NULL` to an array.
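As a quick sanity check (a sketch based on standard BigQuery cast rules, not anything dbt-specific), the cast succeeds once the element type is spelled out:

```sql
-- Fails on BigQuery: ARRAY without an element type is not a valid type name
-- select safe_cast(null as array) as tags;

-- Works on BigQuery: the element type is explicit
select cast(null as array<string>) as tags;
```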
What I’ve already tried
I tried mocking the data in the CSV fixture in several different ways, as listed below.
1. Quoting the array value in the CSV, e.g. `"['a','b']"` or `"[""a"", ""b""]"`.
2. Setting the column's `data_type` to `array` in the dbt schema file.
3. Using `safe_cast` in the model, e.g. `safe_cast(tags as array) as tags`.
4. Using the YAML fixture format instead of CSV, which works fine (see the sketch after this list).
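For reference, a minimal sketch of the working YAML-style fixture (the model and input names are hypothetical):

```yaml
# A minimal sketch; dict is the default fixture format and, unlike csv,
# it supports complex types such as arrays.
unit_tests:
  - name: test_tags_model
    model: my_model                  # hypothetical model name
    given:
      - input: ref('source_table')   # hypothetical input
        format: dict                 # default; csv fixtures cannot express arrays
        rows:
          - {id: 1, tags: ['a', 'b', 'c']}
          - {id: 2, tags: ['x', 'y']}
          - {id: 3, tags: null}      # a null array is accepted here
    expect:
      rows:
        - {id: 1, tags: ['a', 'b', 'c']}
        - {id: 2, tags: ['x', 'y']}
        - {id: 3, tags: null}
```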
Some example code or error messages
The CSV input fixture in question:

```csv
id,tags
1,"['a', 'b', 'c']"
2,"['x', 'y']"
3,NULL
```