I’m wondering what is the best practice of where to do this casting?
For example, my first step is always to create a base_
table that renames columns. Seems like that would be the right place to do the casting so that the types automatically propagate to downstream queries.
Another consideration is that when you CAST something you always have to specify the column name, which can lead to a lot of repeating as in Josh’s example above.
That could probably be solved through a macro. I checked dbt_utils and don’t see anything like that yet.
So then the structure could be something like:
with source as (
select * from stg.ga_page_stats
),
casted as (
select
mcast(date, date),
mcast(pagePath, varchar(3000)),
mcast(sourceMedium, varchar(512)),
mcast(campaign, varchar(512)),
mcast(keyword, varchar(512)),
mcast(adContent, varchar(512)),
mcast(fullReferrer, varchar(512)),
mcast(pageviews, int),
mcast(uniquePageviews, int)
from source
),
renamed as (
select
date as date,
pagePath as page_path,
sourceMedium as traffic_source_medium,
campaign as traffic_campaign,
keyword as traffic_keyword,
adContent as traffic_content,
fullReferrer as full_referrer,
pageviews as pageviews,
uniquePageviews as unqiue_pageviews
from casted
),
Or am I overcomplicating things here?