Hello friendly people!
I am migrating MSSQL (on premise) code to an Azure stack that includes dbt. dbt is fantastic, but I am not sure how I can translate the following code to something that dbt accepts:
,CTE_ToBeMapped AS (
SELECT sa.*
,LTRIM(RTRIM(VALUE)) AS AnswerValue_StringSplit
FROM {{ ref('database__full__table') }} sa
INNER JOIN CTE_MappingExportIdToBeMapped m
ON sa.QuestionKeyCoalesce = m.QuestionKeyOldMC
OUTER APPLY STRING_SPLIT(AnswerValue, '|')
)
Before this CTE, each QuestionKey can hold multiple AnswerValues, such as ‘Answer1|Answer2|Answer3’. This CTE is supposed to generate one row for each question-answer pair, so that the above example (one question with three answers) results in three rows (one question with one answer on each row).
For more context, the full query up to this point. I translated everything to something understandable in English, but the CHARINDEX line might be confusing; the QuestionKeys are normally in format ‘QUESTION’, but if they are multiplechoice questions from one specific system, they would be ‘QUESTION_1’, ‘QUESTION_2’, et cetera, with each holding only one answer in its AnswerValue. The other system, however, would have ‘QUESTION’ with all the possible answers in its AnswerValue. I need to translate the ‘QUESTION’ ‘Answer1|Answer2’ variant to the ‘QUESTION_1’ ‘Answer1’, ‘QUESTION_2’, ‘Answer2’ variant.
--The mapper, filtered on relevant multiplechoice rows
WITH CTE_MappingExportId AS (
SELECT DISTINCT ExportId AS QuestionKey6
,LEFT(ExportId, LEN(ExportId) - CHARINDEX('_',REVERSE(ExportId))) AS QuestionKeyOldMC
,QuestionText AS QuestionText2
,LTRIM(RTRIM(AnswerValues)) AS AnswerValue3
FROM {{ ref('database__seed__mapper') }}
WHERE (Type = 'Multiplechoice' OR Type = 'Multiple answers possible')
)
--Shortlist of old mapping codes of the multiple choice codes (i.e. the code without _1, _2, et cetera)
,CTE_MappingExportIdToBeMapped AS (
SELECT DISTINCT QuestionKeyOldMC
FROM CTE_MappingExportId
)
--The rows that can be mapped
--We have a set with only multiplechoice codes from the mapper
--And from this set we can safely use string split to turn every answer into a single row
,CTE_ToBeMapped AS (
SELECT sa.*
,LTRIM(RTRIM(VALUE)) AS AnswerValue_StringSplit
FROM {{ ref('database__full__table') }} sa
INNER JOIN CTE_MappingExportIdToBeMapped m
ON sa.QuestionKeyCoalesce = m.QuestionKeyOldMC
OUTER APPLY STRING_SPLIT(AnswerValue, '|')
)
I am not entirely sure what dbt cannot handle. I suspect it is the STRING_SPLIT, but possibly also the OUTER APPLY. I could in theory work around the OUTER APPLY (although it would make the code very messy), but I do not see how I can work around STRING_SPLIT. I tried playing around with macros, but, my knowledge of dbt is just too limited to give meaningful examples of what didn’t work here. I found only one similar example in this forum, but that is for a static ‘split left and split right’ function, while I could in theory need to generate ten new lines because I might have ten multiplechoice answers.
I hope someone can help me translate the above code to something dbt accepts - thank you very much in advance!