Greenplum again! CTEs are not great in Greenplum

jamesg · January 17, 2019, 10:32pm

Hi Folks,

Bit of a spanner in the works for me, using Greenplum 4.x. There are two query planners in Greenplum: the legacy planner, which is very clunky, and the super (not so super) “optimizer”. The optimizer reverts to the legacy planner when things get too complicated. Unfortunately, neither of them manages to rewrite CTEs sensibly. Take a query with some base models in CTEs, run it through the optimizer, and it will “revert” to the legacy planner; the resulting plan is very expensive and the query doesn’t run. If you copy and paste the CTEs into subqueries, the optimizer doesn’t revert and can do some amazing tricks!
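To make the "copy and paste the CTEs into subqueries" step concrete, here is a minimal sketch of the two shapes of the same query. The table `orders`, the base model `base_orders`, and the sample rows are made up for illustration, and it runs against an in-memory SQLite database rather than Greenplum, so it only shows that the two forms return the same rows. It says nothing about how either Greenplum planner treats them.

```python
import sqlite3

# Hypothetical base model written as a CTE (names are illustrative only).
CTE_FORM = """
with base_orders as (
    select customer_id, amount from orders where amount > 0
)
select customer_id, sum(amount) as total
from base_orders
group by customer_id
order by customer_id
"""

# The same query with the CTE body pasted in as a subquery.
SUBQUERY_FORM = """
select customer_id, sum(amount) as total
from (
    select customer_id, amount from orders where amount > 0
) as base_orders
group by customer_id
order by customer_id
"""


def run(sql):
    """Run a query against a small throwaway in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table orders (customer_id integer, amount integer)")
    conn.executemany(
        "insert into orders values (?, ?)",
        [(1, 10), (1, 5), (2, 0), (2, 7)],
    )
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


cte_rows = run(CTE_FORM)
subquery_rows = run(SUBQUERY_FORM)
print(cte_rows == subquery_rows)
```

The rewrite is purely mechanical, which is why it seems plausible for a tool to do it instead of a person.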

So how difficult would it be to add another materialisation strategy, “subquery”, which is like ephemeral except that it writes the dependent models into subqueries instead of CTEs?
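As a rough sketch of what that could mean, the function below renders the same dependent model in either style. This is not how dbt compiles ephemeral models; `render`, the `{base_orders}` placeholder convention, and the model names are all hypothetical, and it ignores nested dependencies and models referenced more than once.

```python
def render(final_sql, models, strategy):
    """Inline dependent models into final_sql as CTEs or as subqueries.

    final_sql refers to each model through a {name} placeholder
    (a made-up convention for this sketch, not dbt's ref()).
    """
    if strategy == "cte":
        # Prepend one CTE per model and reference it by name.
        ctes = ",\n".join(
            f"{name} as (\n{sql}\n)" for name, sql in models.items()
        )
        refs = {name: name for name in models}
        return f"with {ctes}\n" + final_sql.format(**refs)
    if strategy == "subquery":
        # Paste each model body in place as an aliased subquery.
        refs = {name: f"(\n{sql}\n) as {name}" for name, sql in models.items()}
        return final_sql.format(**refs)
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical base model and final query.
MODELS = {
    "base_orders": "select customer_id, amount from orders where amount > 0",
}
FINAL_SQL = (
    "select customer_id, sum(amount) as total\n"
    "from {base_orders}\n"
    "group by customer_id"
)

cte_sql = render(FINAL_SQL, MODELS, "cte")
subquery_sql = render(FINAL_SQL, MODELS, "subquery")
print(subquery_sql)
```

The hard part in a real implementation would presumably be chains of ephemeral models, where each subquery has to be inlined recursively rather than listed once at the top.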

(All of the above may or may not be an issue in GP5.x, but I won’t be able to test that until later in the year.)

-James

drew · January 17, 2019, 10:56pm

Hey @jamesg - this would actually be a great question for an issue in GitHub! We’ll be better able to discuss potential features of dbt over there.

My instinct is that a subquery materialization might be a little complicated, but happy to explore what it could look like in the issue!

jamesg · January 17, 2019, 11:09pm

Thanks @drew, no problem: https://github.com/fishtown-analytics/dbt/issues/1248

elexisvenator · January 18, 2019, 12:35am

Greenplum is based on Postgres. Does it keep the Postgres behaviour of treating CTEs as an optimisation fence? If it does, then I would recommend avoiding CTEs entirely, regardless of the query planner.

jamesg · January 18, 2019, 4:05am

When the query optimizer is selected in GP4.x (set optimizer=on), the CTEs may or may not end up being materialized internally (fenced), depending on complexity. With dbt you can easily end up with many CTEs and higher complexity, at which point the query gets too complex for the optimizer, reverts to legacy mode, and “fences” them.
