Check recency-source for model in DBT cloud

hrvoje · November 24, 2022, 1:33pm

Hi,

I tried to use recency from dbt-utils to check if data is coming as expected to source tables. For example I want to see if there is data at all in source tables for (-70 minutes from now).
But whatever parameters I use in my source.yml I always get PASS for tests done.

I am using something like this in my sources.yml:

models:
  - name: uat_demand_nosales_flatten
    tests:
      - dbt_utils.recency:
          datepart: minutes
          field: tradeHour
          interval: 70

joellabes · November 25, 2022, 1:51am

One thing that sticks out to me is that your datepart is minutes when it should probably be the singular minute. I would expect that to throw an error as opposed to silently passing though.

I would start by looking at the code in the target directory, which will include the rendered versions of your queries to see what was actually run against the database.

If you run that query you can see why no records are being returned and iterate on the query until you find the cause.

You talked about testing your sources, but your example code refers to models. Keep in mind that if you are testing sources (the original raw tables that dbt uses in its models as opposed to the ones it creates), then there is native source freshness available: freshness | dbt Developer Hub

hrvoje · November 29, 2022, 10:43am

Thank you Joel for your reply!
By checking recency on model I’m actually checking weather data is properly coming in since this model is first in line,
But I’ll be also using source freshness on source table.

Changed minute but didn’t observed any difference.
Checking target directory SQLs generated, that’s useful I’ll have to do more checks and report back.

hrvoje · December 1, 2022, 10:37am

Checking generated dbt_utils_model_name_n_ef0c3b in target folder for my case and from what I observed I see that current_timestamp used in recency is using:

select current_timestamp::timestamp_ntz which gives:
2022-12-01T02:22:07.700000-08:00 if you use timestamp_tz.

And in our model we are running refresh by hour like:
select trunc(dateadd(hour,-1,SYSDATE()),'HOUR') which currently gives
2022-12-01T09:00:00 and if timestamp_tz is used thats:
2022-12-01T09:00:00-08:00

I was checking difference between SYSDATE and CURRENT_DATE here. Although this is for Oracle I believe that same apply for Snowflake

Wouldn’t it be logical than that you should use SYSDATE in recency instead of CURRENT_DATE ?

joellabes · December 2, 2022, 1:38am

Are you saying that when you change the recency query defined here to use sysdate() instead of current_date() that you can cause your test to fail as you expect?

Each warehouse can have a specific implementation of the dbt.current_timestamp() macro. It looks like this is how the Oracle adapter has implemented it:

github.com

oracle/dbt-oracle/blob/f7fe1fd206286b780ec02238616da6a120b70552/dbt/include/oracle/macros/utils/timestamps.sql#L18-L20


      
          {% macro oracle__current_timestamp() -%}
            SYS_EXTRACT_UTC(current_timestamp)
          {%- endmacro %}

This is based on dbt utils 0.9.6 (the latest shipping version) - which version of dbt utils are you using?

Topic		Replies	Views
abort dbt job if conditions do not exist Help	10	3595	July 19, 2023
Run the model not more often than once per day. Help snowflake , dbt-core	0	102	August 13, 2024
source freshness conditional error / warning time possible? Help testing , bigquery , dbt-cloud	0	926	May 15, 2023
Running the same freshness test for all models Help freshness	2	3700	June 14, 2023
DBT Test : How to avoid missing values in source data ? Help testing , dbt-cloud	6	1517	February 11, 2024

Check recency-source for model in DBT cloud

Related topics