I couldn’t find any place to post comments directly on the blog post, but I wanted to call out a big problem with
group by 1 that the author didn’t cover.
group by 1 makes the order of columns in your
select clause semantically meaningful! As in, changing the order of the columns in your
select clause could change your output if you use
group by 1.
That’s such a dangerous line to cross that it totally kills any value I might see in using
group by 1. Maintaining SQL that uses this pattern become a dangerous exercise. You reorder some columns just because it looks better to you, thinking it’s a harmless change, but oops! The output is all different now.
The author writes:
Plus, as someone that writes a lot of SQL, I very much appreciate the fact that
1is exactly one character.
I think this reflects a typical divide in programming styles and patterns. Some are optimized for the person writing the code, and some are optimized for the person reading and maintaining the code.
group by 1 is very much a benefit to the writer, and a harm to the reader and maintainer.
The author opens her post with a choice between two styles of writing a query. But she misses a third style:
with enhanced as ( select date_trunc('month', order_created_at) as order_month, * from core.fct_orders ) select order_month, count(order_id) as n_orders, sum(revenue) as total_revenue from enhanced group by order_month;
If you want not to repeat yourself in your
group by clause, then common table expressions are the way to go. For the sake of maintainability, I’d steer clear of
group by 1.