How I Used ChatGPT to Auto-Generate dbt Model Descriptions from SQL Logic

Hello

One small but impactful improvement I have made to our dbt workflow is automating the generation of model descriptions by using ChatGPT to analyze the underlying SQL. :innocent:Descriptions are often neglected / rushed; yet they’re incredibly important for data discovery and documentation.

I built a simple Python script that sends the contents of .sql model files to the OpenAI API & returns concise, human-readable descriptions, which I then inject into schema.yml. :slightly_smiling_face:

This saves a lot of manual writing and helps maintain consistency across the team. It’s particularly useful when onboarding new team members / reviewing legacy models. :thinking:

I have set up the script to run locally but I’m exploring adding it to our CI pipeline so model documentation stays fresh with every commit. Has anyone else tried something similar or taken it a step further with test or exposure generation? :thinking: I checked About documentation | dbt Developer Hub guide for reference .

When a teammate asked me what is ChatGPT , this small tool I hacked together turned out to be the best example of practical usage turning raw SQL into structured, documented knowledge. :slightly_smiling_face:Sharing here in case others want to improve model documentation workflows using LLMs.

Thank you !! :slightly_smiling_face: