Curious how analytics teams actually manage column-level documentation in practice.
Where do descriptions and business definitions usually live?
dbt docs, a catalog, spreadsheets, somewhere else?
And if someone had to document a few hundred columns, what workflow would they realistically use?
The teams I’ve seen stay sane usually make the dbt YAML the source of truth, then get disciplined about which columns actually deserve column-level docs.
A few patterns that seem to hold up:
- keep business definitions close to the model in schema.yml, not in a separate spreadsheet,
- only require detailed docs for exposed or high-traffic columns, not every intermediate field,
- add a PR check so new columns do not land undocumented by accident,
- use templated wording for common audit fields so people are not rewriting the same definition 40 times.
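The PR check from the list above could look roughly like this. It is a minimal sketch, assuming the check reads the `manifest.json` that dbt writes to `target/` and that each model node in it carries `tags` and a `columns` mapping with a `description` field. The `exposed` tag used to decide which models need docs is a hypothetical team convention, not a dbt built-in, and the function names are made up for illustration.

```python
import json


def find_undocumented(manifest: dict, required_tag: str = "exposed") -> list[str]:
    """Return sorted 'model.column' names that lack a description.

    Only models carrying `required_tag` are checked, so intermediate
    models are not forced to document every field.
    """
    missing = []
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue
        if required_tag not in node.get("tags", []):
            continue
        for col_name, col in node.get("columns", {}).items():
            # Treat None, empty, and whitespace-only descriptions as missing.
            if not (col.get("description") or "").strip():
                missing.append(f"{node['name']}.{col_name}")
    return sorted(missing)


def check_file(path: str = "target/manifest.json") -> int:
    """Print gaps and return a CI-style exit code (0 = ok, 1 = gaps found)."""
    with open(path) as f:
        missing = find_undocumented(json.load(f))
    for name in missing:
        print(f"missing description: {name}")
    return 1 if missing else 0
```

In CI you would run `dbt parse` first and then fail the job on a non-zero result from `check_file()`. One caveat: this only sees columns that are already declared in YAML. Catching columns that exist in the warehouse but are absent from YAML would need a comparison against `catalog.json` or the warehouse itself.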
For a few hundred columns, I would not try to document them all in one pass. I’d do it by domain and start with the tables people actually query. Otherwise you end up with a lot of stale prose nobody trusts.
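One way to pick the order is to rank domains by how much query traffic their undocumented tables get, then work through each domain from the busiest table down. A rough sketch, assuming you can export per-table query counts from your warehouse's query history; the `TableStats` fields and the ordering rule are hypothetical, not something dbt provides.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    domain: str
    table: str
    query_count: int   # assumed input, e.g. queries over the last 30 days
    undocumented: int  # columns that still have no description


def documentation_backlog(stats: list[TableStats]) -> list[TableStats]:
    """Order the work: busiest domain first, busiest table first within it."""
    # Fully documented tables drop out of the backlog.
    todo = [s for s in stats if s.undocumented > 0]

    # Total query traffic per domain, counting only tables that need work.
    domain_traffic: dict[str, int] = {}
    for s in todo:
        domain_traffic[s.domain] = domain_traffic.get(s.domain, 0) + s.query_count

    return sorted(
        todo,
        key=lambda s: (-domain_traffic[s.domain], s.domain, -s.query_count, s.table),
    )
```

Grouping by domain keeps one subject-matter expert in the loop at a time, which tends to matter more than the exact ranking.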
If a catalog tool exists, I’d treat it as the display layer, not the authoring layer.