Do you test ephemeral models?

dylanbaker · August 21, 2018, 11:18am

We test all (realistically, most) of the views and tables in our analytics schema, which holds all the data deemed ready for use.

We currently don’t have much testing on downstream models that aren’t materialised, particularly ‘base’ ephemeral models that have no joins and effectively just clean up the individual raw tables. We’ve started adding tests for them, which has massively increased the number of tests our project has (>1000), and I wanted to know what people’s views on this are.

It’s starting to take much longer for the tests to run as well. We currently run tests after each production refresh of the tables, which may not be necessary.

Questions:

Do you test all your ephemeral/down-stream models?
Do people have a specific setup with all of this?
Is there a best practice?

mplovepop · August 24, 2018, 5:03pm

I like testing everything, including ephemerals. It’s a handy development tool. It does take quite some time to run all tests though.

josh · February 11, 2019, 9:35pm

Agreed, we like testing ephemeral models. Thinking behind this:

We create ephemeral models for other downstream models to use.
The downstream models are invariably making a bunch of assumptions about the data coming out of the ephemeral models.
It can be hard or even impossible to reliably test all of those assumptions indirectly through the downstream models. The worst-case scenario is that an assumption is violated, yet all of the downstream models and tests run without error, incorrect data gets used in analytics, and incorrect decisions are then made based on those analytics.
Therefore we should test as many assumptions as we can about ephemeral models in an automated fashion.

Yes, it does cause more tests to run, which can cause a slowdown. One solution could be tagging tests and running only subsets of them at various times.
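
As a rough sketch of the tagging idea, assuming a reasonably recent dbt version: tests can carry tags in the schema file, and a run can then select one tag at a time. The file path, model name, column name, and tag names below are all made up for illustration.

```yaml
# models/base/schema.yml (hypothetical path, model, column, and tag names)
version: 2

models:
  - name: base_orders          # hypothetical ephemeral base model
    columns:
      - name: order_id
        tests:                 # newer dbt versions also accept the key `data_tests`
          - unique:
              config:
                tags: ['every_refresh']   # cheap check, run after each refresh
          - not_null:
              config:
                tags: ['nightly']         # run on a slower schedule

# Run only one subset at a time, for example:
#   dbt test --select tag:every_refresh
#   dbt test --select tag:nightly
# (older dbt versions used --models instead of --select)
```

Which tests belong in which subset is a judgment call; the split above is only an example.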

Also, from a dbt perspective it doesn’t seem like a great idea to be attached to models being ephemeral. The whole point is to be able to flip easily between model types - ephemeral, view, and table - as performance or other needs dictate. If we design something with the hardcoded assumption that it will always and only be ephemeral, we’re probably building in technical debt or bugs that will pop up in the future.
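
To illustrate, here is a minimal sketch of setting the materialization in one place, so that flipping it does not touch the models or the tests declared on them. The project and folder names are hypothetical, and the `+` prefix assumes a newer dbt version (older versions wrote `materialized:` without it).

```yaml
# dbt_project.yml (fragment; `my_project` and `base` are hypothetical names)
models:
  my_project:
    base:
      +materialized: ephemeral   # change to view or table as needs dictate
```

Tests declared against the base models in a schema file stay as they are when this value changes.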

gesara · May 2, 2024, 7:40pm

Hello! I wanted to test an ephemeral model using dbt_expectations.expect_column_to_exist, but it failed. So it looks like this test does not make sense for an ephemeral model. Did you try this? Thank you in advance.
