We’ve all been there: we create a new .yml file and straight away type in version: 2.
But whyyyyyy must we write this? Is there any other version allowed?
There once was! Back in 2018 (when schema.yml files had to be called schema.yml, and a time before dbt docs was a thing!), your schema.yml files looked like this:
customers:
  constraints:
    not_null:
      - id
      - email
      - favorite_color
    unique:
      - id
      - email
    accepted_values:
      - field: favorite_color
        values: ['blue', 'green']
      - field: likes_puppies
        values: ['yes']
orders:
  constraints:
    not_null:
      - id
WILD!
We had to throw this structure out when we introduced dbt docs, since there was no place to put descriptions.
Now, this would look like:
version: 2
models:
  - name: customers
    description: this is the customers model and also not a useful description
    columns:
      - name: id
        description: primary key of the model
        tests:
          - unique
          - not_null
      - name: email
        tests:
          - unique
          - not_null
      - name: favorite_color
        tests:
          - accepted_values:
              values: ['blue', 'green']
      - name: likes_puppies
        description: this should be true for everyone
        tests:
          - accepted_values:
              values: ['yes']
  - name: orders
    columns:
      - name: id
        tests:
          - not_null
^ More lines of code, but it's easier to reason about one column at a time, and there's an obvious place to add a description: field.
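To make the relationship between the two layouts concrete, here is a rough sketch of how the old structure maps onto the new one. This is not code from dbt: the function name convert_v1_to_v2 and the example_v1 data are made up for illustration, and it works on already-parsed dictionaries rather than on YAML text.

```python
def convert_v1_to_v2(v1: dict) -> dict:
    """Map a parsed v1-style dict onto the v2 layout (hypothetical helper, not part of dbt)."""
    models = []
    for model_name, spec in v1.items():
        constraints = spec.get("constraints", {})
        columns = {}  # column name -> list of tests, in first-seen order

        # v1 grouped columns under each test; v2 groups tests under each column.
        for test_name in ("unique", "not_null"):
            for column in constraints.get(test_name, []):
                columns.setdefault(column, []).append(test_name)

        # accepted_values entries carry their own column name in "field".
        for entry in constraints.get("accepted_values", []):
            columns.setdefault(entry["field"], []).append(
                {"accepted_values": {"values": list(entry["values"])}}
            )

        models.append(
            {
                "name": model_name,
                "columns": [
                    {"name": name, "tests": tests} for name, tests in columns.items()
                ],
            }
        )
    return {"version": 2, "models": models}


# Illustrative input only: a trimmed-down version of the v1 example above.
example_v1 = {
    "customers": {
        "constraints": {
            "not_null": ["id", "email"],
            "unique": ["id"],
            "accepted_values": [
                {"field": "favorite_color", "values": ["blue", "green"]}
            ],
        }
    },
    "orders": {"constraints": {"not_null": ["id"]}},
}

example_v2 = convert_v1_to_v2(example_v1)
```

The v1 layout has no slot for a description, so nothing in this mapping can produce one. That is the gap the v2 layout fills.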
We had to add the version: 2 so dbt knew how to read your YAML files!
These days, we don’t support the original YAML structure, but we keep the version: 2 so that one day, if we need to move to a version: 3, the code in dbt that figures out whether you’re on v2 or v3 is easy for us to write!
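As a minimal sketch of why an explicit version key makes that check easy, consider the following. It is not dbt's actual parser: every name here (detect_schema_version, parse_v2, PARSERS, parse_schema) is hypothetical, and treating a file with no version key as the retired v1 layout is an assumption made for the example.

```python
def detect_schema_version(parsed: dict) -> int:
    """Return the version declared in an already-parsed YAML file."""
    version = parsed.get("version")
    if version is None:
        # Assumption for this sketch: no version key means the old v1 layout,
        # where the top-level keys were model names.
        return 1
    if isinstance(version, bool) or not isinstance(version, int):
        raise ValueError(f"version must be an integer, got {version!r}")
    return version


def parse_v2(parsed: dict) -> list:
    """Stand-in for a real v2 parser: just list the model names."""
    return [model["name"] for model in parsed.get("models", [])]


# One parser per supported layout; a future v3 would be one more entry here.
PARSERS = {2: parse_v2}


def parse_schema(parsed: dict) -> list:
    version = detect_schema_version(parsed)
    parser = PARSERS.get(version)
    if parser is None:
        raise ValueError(f"unsupported schema version: {version}")
    return parser(parsed)


model_names = parse_schema({"version": 2, "models": [{"name": "customers"}]})
```

Without the key, the parser would have to guess the layout from the shape of the file, which gets harder with every new layout.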
This is a pattern used in other tools too: CircleCI does a similar thing, and so do AWS IAM JSON policies!
Shout-out to all the community members who used the v1 structure!