Template for complex incremental models

Background:
I’m looking for a replacement of a >20yr old propriatery development tool that offers a lot of functionality similar to dbt. Over time we’ve developed a lot of features tailored to our needs, most that I’m happy to throw away, but one in particular that would be a deal breaker.

Our propriatery tool is better equipped to handle more complex incremental models that dbt has out-of-the-box. If we are to migrate this is something we need to adress.

Question:
I have listed one example ‘use-case’ below. From the functional perspecifive we have one model, however we’d like to implement that in separate steps or multiple technical models without needing to code the technical template.

I’m seeking for a strategy that would allow us to generate these as a template. I can image run_query and/or hooks may do the trick but it feels rather hacky to do so. On the other hand I’m relatively new to dbt so I don’t know all the possibilities. I’m looking forward to your suggestions, thank you.

Example:

Given the source data model:

Table1:
This is a big fact table for which we need incremental processing
PK: Key1, Key2, ValidDate,
Other fields: value, isCurrent*, changeMarker*

  • marks technical use only

Table2:
This is a table fed indirectly through user input
PK: Key1, OutputKey, ValidFrom
Other fields: ValidUntil

The developer that implements the business logic gives the following configuration to declare the model.

Config: {
	Input: [
		Table1: {
			TimedimensionType: Snapshot, 
			Key: [Key1, Key2]
		},
		Table2: {
			TimedimensionType: Interval, 
			Key: [Key1, OutputKey]
		}
	],
	JoinClause: [
		Table1.Key1 = Table2.Key1
	],
	Output:{
		TimedimensionType: Snapshot, 
		Keys: [
			{
				inExpr: Table2.OutputKey, 
				outAlias: OutputKey
			}
		],
		Values: [
			{
				inExpr: MEDIAN(Table1.Value), 
				outAlias: MedianValue
			}
		],
		Aggregation: True
	}
]

A full execution (in develop/debug) mode will compile into code like

CREATE OR REPLACE TABLE Output AS
SELECT Table2.OutputKey OutputKey, Table1.ValidDate, MEDIAN(Table1.Value) SumValue
FROM Table1
JOIN Table2 ON (Table1.Key1 = Table2.Key1 AND Table2.ValidFrom <= Table1.ValidDate AND Table1.ValidDate < Table2.ValidUntil)
WHERE Table1.isCurrent ='Y'
GROUP BY Table2.OutputKey, Table1.ValidDate

However in production we’d like incremental execution instead. More or less accomplished by the following steps.
Please note that I simplified for the sake of the example, I am not seeking for feedback on inaccuracies nor inefficiencies.

Fetch lastProcessedChangeMarker
Fetch newChangeMarker

MERGE INTO Table1_Interval_helper tgt USING 
	(SELECT Key1, MIN(ValidDate) ValidFrom, MAX(ValidDate+1) ValidUntil
	 FROM   Table1
	 WHERE isCurrent ='Y' AND changeMarker > lastProcessedChangeMarker) src
ON (src.key1 = tgt.key1)
WHEN NOT MATCHED THEN INSERT (tgt) values (src)
WHEN MATCHED THEN UPDATE SET tgt.ValidFrom = LEAST(tgt.ValidFrom, src.ValidFrom) AND tgt.ValidUntil = GREATEST(tgt.ValidUntil, src.ValidUntil)

CREATE OR REPLACE TABLE Output_join_helper_current AS
SELECT Key1, Key2, OutputKey, GREATEST(Table1.ValidFrom, Table2.ValidFrom ) ValidFrom , LEAST(Table1.ValidUntil, Table2.ValidUntil ) ValidUntil 
FROM   Table1_Interval_helper Table1
INNER JOIN Table2 ON (Table1.Key1 = Table2.Key1 AND Table2.ValidFrom < Table1.ValidUntil AND Table1.ValidFrom < Table2.ValidUntil)

CREATE OR REPLACE TABLE Output_table_deltas AS
WITH DELTA_INTERVALS_HELPER AS
		(SELECT Key1, Key2, OutputKey, ValidFrom, ValidUntil
		FROM Output_join_helper_current
		MINUS -- Actually we de a strict interval logic based minus here
		SELECT Key1, Key2, OutputKey, ValidFrom, ValidUntil
		FROM Output_join_helper_previous),
	DELTA_HELPER AS
		(SELECT OutputKey, ValidDate
		FROM DELTA_INTERVALS_HELPER, DATE_ITERATOR(ValidFrom, ValidUntil)),
	DELTA_TABLE1 AS
		(SELECT OutputKey, ValidDate
		FROM Table1
		JOIN Output_join_helper_current ON (Table1.Key1 AND Output_join_helper_current.ValidFrom < Table1.ValidUntil AND Table1.ValidFrom < Output_join_helper_current.ValidUntil)
		WHERE changeMarker > lastProcessedChangeMarker)
SELECT *
FROM   DELTA_HELPER
UNION 
SELECT * 
FROM   DELTA_TABLE1

MERGE INTO Output tgt
USING (SELECT delta.OutputKey, delta.ValidDate, MEDIAN(CASE WHEN Table1.Key1 IS NOT NULL AND Table2.Key1 IS NOT NULL THEN value END), CASE WHEN
SUM(CASE WHEN Table1.Key1 IS NOT NULL AND Table2.Key1 IS NOT NULL THEN 1 END) > 0 THEN 'Y' ELSE 'N' isCurrent, newChangeMarker ChangeMarker
       FROM Output_table_deltas delta
	   LEFT OUTER JOIN Table2 ON (delta.outputKey =Table2.OutputKey AND Table2.ValidFrom <= delta.ValidDate AND delta.ValidDate < Table2.ValidUntil)
	   LEFT OUTER JOIN Table1 ON (Table1.key1 = Table2.Key1 AND Table1.ValidDate = delta.ValidDate AND Table1.isCurrent=Y)) src
ON (tgt.OutputKey = src.OutputKey AND tgt.ValidDate = src.ValidDate)
WHEN NOT MATCHED INSERT (tgt) VALUES (src)
WHEN MATCHED UPDATE tgt = src

–Post administration

DROP TABLE Output_join_helper_previous;
RENAME TABLE Output_join_helper_current TO Output_join_helper_previous;

Store newChangeMarker as lastProcessedChangeMarker

Does nobody have any ideas/guidelines on how to approach this ?