Attribute - Convention Nomenclature

jmilhomem · May 14, 2019, 3:59pm

Hi community!

We are changing our data stack and considering some other improvements, one of which concerns our naming conventions.
I’d like to know how you name your attributes. Could you share your approach, please?

Here is what we are considering:

Thanks!!

drew · May 14, 2019, 4:30pm

Hey @jmilhomem - it’s great that you’re thinking hard about these conventions! In software, we refer to this type of naming scheme as Hungarian notation.

I don’t use regimented conventions like this when building out my data models - I instead like to use more contextual prefixes/suffixes, though it’s certainly not super well-defined on paper anywhere. I do something like:

- is_<boolean>
- <timestamp>_at
- first_<timestamp>_at
- last_<timestamp>_at
- count_<number>
- <entity>_id

So, not too dissimilar from your approach I think!
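
As a rough illustration only, the prefixes and suffixes listed above could be written down as patterns and used to check what kind of column a name suggests. This is a minimal sketch, not something from the thread: the `CONVENTIONS` table and the `classify_column` helper are hypothetical names, and the lowercase snake_case restriction is an assumption.

```python
import re

# Hypothetical patterns mirroring the informal conventions listed above.
# Order matters: first_/last_ timestamps would also match the generic *_at rule.
CONVENTIONS = [
    ("boolean", re.compile(r"^is_[a-z0-9_]+$")),
    ("first_timestamp", re.compile(r"^first_[a-z0-9_]+_at$")),
    ("last_timestamp", re.compile(r"^last_[a-z0-9_]+_at$")),
    ("timestamp", re.compile(r"^[a-z0-9_]+_at$")),
    ("count", re.compile(r"^count_[a-z0-9_]+$")),
    ("id", re.compile(r"^[a-z0-9_]+_id$")),
]


def classify_column(name):
    """Return the kind of column a name suggests, or None if no rule matches."""
    for kind, pattern in CONVENTIONS:
        if pattern.match(name):
            return kind
    return None


for column in ["is_active", "created_at", "first_order_at", "customer_id", "email"]:
    print(column, "->", classify_column(column))
```

A check like this only inspects names, so it cannot confirm that a column named `is_active` is actually stored as a boolean.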

The thing that jumps out at me about your proposed conventions is that they mix types and “kinds”. Whereas nm_ and dt_ denote a database type, sk_ and ds_ denote a kind of string. I think that’s totally fine if it works for you, but I just wanted to point it out.

mplovepop · May 16, 2019, 2:24pm

I like is_* and has_* for booleans, and *_at for timestamps. Beyond that we are pretty inconsistent, mostly because my thoughts keep evolving. At the moment I’m inclined to agree with <entity>_id, though we have an awful lot of primary keys just named id and our analysts are used to that. What I don’t like about plain id is that I’ve had several cases of junior analysts just joining all tables by id instead of understanding the foreign keys, but that is mostly a matter of training and unfamiliarity with SQL. Still, explicit is better than implicit.
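
To make the join pitfall concrete, here is a small sketch using Python's built-in sqlite3 module. The `customers` and `orders` tables and their rows are invented for illustration; the point is only that joining on two unrelated `id` columns runs without error and returns plausible-looking but wrong totals.

```python
import sqlite3

# Hypothetical tables: both use a bare "id" primary key, and orders carries
# customer_id as the real foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customers (id integer primary key, name text);
    create table orders (id integer primary key, customer_id integer, amount real);
    insert into customers values (1, 'Ana'), (2, 'Ben');
    insert into orders values (1, 2, 10.0), (2, 2, 20.0), (3, 1, 5.0);
""")

# Mistaken join: order ids are matched against customer ids.
wrong = conn.execute("""
    select c.name, sum(o.amount)
    from orders o
    join customers c on o.id = c.id
    group by c.name
    order by c.name
""").fetchall()

# Intended join: the foreign key is matched against the customer's primary key.
right = conn.execute("""
    select c.name, sum(o.amount)
    from orders o
    join customers c on o.customer_id = c.id
    group by c.name
    order by c.name
""").fetchall()

print("joined on id:         ", wrong)
print("joined on customer_id:", right)
```

With explicit names, the mistaken version would have to be written as `o.order_id = c.customer_id`, which is much easier to spot in review.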

For counts I like *_count, which feels more intuitive in English to me. For money I use an explicit currency_code column if one exists, and suffix with *_usd if it doesn’t.

When there is a compound key I create a surrogate by concatenation and then almost always just name it row_key to indicate it shouldn’t join to anything.
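
A minimal sketch of the concatenation idea, assuming a plain delimiter between the key parts. The `make_row_key` helper and the example values are hypothetical; in a warehouse this would normally be a SQL expression rather than Python.

```python
def make_row_key(*parts, sep="-"):
    """Join the parts of a compound key into a single surrogate string.

    The separator keeps (1, 23) and (12, 3) from collapsing to the same key.
    Assumption: no part is None and no part contains the separator itself.
    """
    return sep.join(str(part) for part in parts)


# Example: a daily snapshot keyed by (customer_id, snapshot_date).
row_key = make_row_key(42, "2019-05-16")
print(row_key)
```

Without the separator, the keys for (1, 23) and (12, 3) would both come out as "123", so the delimiter is what keeps the surrogate unique.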

My overriding concerns are understandability of the data model, and analyst ergonomics, so I shy away from making final tables feel “programmer-y” with prefix and suffix notation like in your example.

drew · May 16, 2019, 2:37pm

Quick update: I said

I instead like to use more contextual prefixes/suffixes, though it’s certainly not super well-defined on paper anywhere.

It turns out that @claire defined this on paper somewhere! These are just our internal conventions at Fishtown Analytics, but feel free to borrow from them as you see fit.

github.com

dbt-labs/corp/blob/main/dbt_coding_conventions.md#naming-and-field-conventions

## 🚧 This page has moved!

Historically this page housed the venerable, beloved, occasionally-controversial-to-leading-commas-enthusiasts dbt Labs Style Guide. We've updated and expanded that to a [new home on the dbt Developer Hub](https://docs.getdbt.com/guides/best-practices/how-we-style/0-how-we-style-our-dbt-projects), with recommendations on how you can create and enforce a style guide at your organization.
