unit_test: wrong data_type length

I have an issue with the data type length in a unit test. The database is Snowflake.

Unit test basis

  • the tested model has as source the view ‘TECH’.
create or replace view TECH_SCHEMA.TECH(
	TECH_ID,
	SYSTEM,
	STATUS,
	INSERTED_AT
) as
select  TECH_ID,
        SYSTEM,
        STATUS,
        TECH_INSERTED_AT
from    TECH_SCHEMA.TECH_HIST
qualify ROW_NUMBER() OVER (PARTITION BY TECH_ID
                     ORDER BY "INSERTED_AT" DESC) = 1
;
  • this view ‘TECH’ is based on the table ‘TECH_HIST’:
create or replace TABLE TECH_SCHEMA.TECH_HIST (
	TECH_ID NUMBER(38,0) NOT NULL,
	SYSTEM VARCHAR(16777216),
	STATUS NUMBER(4,0) DEFAULT 0,
	INSERTED_AT TIMESTAMP_NTZ(9) DEFAULT CURRENT_TIMESTAMP()
);
  • view ‘TECH’ and table ‘TECH_HIST’ are created and maintained outside dbt

In the unit test the input for ‘status’ in the source ‘TECH’ is set to ‘1000’:

    given:
      - input: source('TECH_SCHEMA', 'TECH')
        rows:
          - {tech_id: '1', system: 'test_system', tech_status: '1000', inserted_at: '1999-01-01 00:00:00'}

Unit test run

However, the compiled code for the source ‘TECH’ looks like:

-- Fixture for TECH
select 
    
        try_cast('1' as NUMBER(38,0))
     as tech_id, 
    
        try_cast('test_system' as character varying(16777216))
     as system, 
    
        try_cast('1000' as NUMBER(3,0))
     as status,
	
        try_cast('1999-01-01 00:00:00' as TIMESTAMP_NTZ)
     as tech_inserted_at_ts

I have no clue why ‘status’ is of type NUMBER(3,0) here and not NUMBER(4,0) as it is defined in the table. - This, of course, leads to a unit_test fail.
Anyone any tips / hints / ideas what might go wrong here?

dbt tests infer data types automatically which explains why you’re seeing this issue. There’s a couple possible paths forward and they all rely on setting an explicit data type on that column in a yaml file. There is an open github issue for data types in testing: https://github.com/dbt-labs/dbt-core/issues/9884

source configuration:

sources:
  - name: TECH_SCHEMA
    tables:
      - name: TECH
        columns:
          - name: tech_id
            data_type: number(38,0)
          - name: system
            data_type: varchar(16777216)
          - name: status
            data_type: number(4,0)  # Explicit definition
          - name: inserted_at
            data_type: timestamp_ntz(9)```
model contract
```models:
  - name: your_model
    config:
      contract:
        enforced: true
    columns:
      - name: status
        data_type: number(4,0)```
Try that and see if it helps. More on data types here: <https://docs.getdbt.com/reference/resource-properties/data-types>

<sub>Note: `@Abigail Green (CHG Healthcare)` originally [posted this reply in Slack](https://getdbt.slack.com/archives/CBSQTAPLG/p1752852041200739?thread_ts=1752834469.771399&cid=CBSQTAPLG). It might not have transferred perfectly.</sub>