I couldn’t find any place to post comments directly on the blog post, but I wanted to call out a big problem with group by 1
that the author didn’t cover.
group by 1
makes the order of columns in your select
clause semantically meaningful! As in, changing the order of the columns in your select
clause could change your output if you use group by 1
.
That’s such a dangerous line to cross that it totally kills any value I might see in using group by 1
. Maintaining SQL that uses this pattern become a dangerous exercise. You reorder some columns just because it looks better to you, thinking it’s a harmless change, but oops! The output is all different now.
The author writes:
Plus, as someone that writes a lot of SQL, I very much appreciate the fact that
1
is exactly one character.
I think this reflects a typical divide in programming styles and patterns. Some are optimized for the person writing the code, and some are optimized for the person reading and maintaining the code. group by 1
is very much a benefit to the writer, and a harm to the reader and maintainer.
The author opens her post with a choice between two styles of writing a query. But she misses a third style:
with enhanced as (
select
date_trunc('month', order_created_at) as order_month,
*
from core.fct_orders
)
select
order_month,
count(order_id) as n_orders,
sum(revenue) as total_revenue
from enhanced
group by order_month;
If you want not to repeat yourself in your group by
clause, then common table expressions are the way to go. For the sake of maintainability, I’d steer clear of group by 1
.