Apologies for reposting, the original contained too much information about our schema.
Here are our two macros, heavily simplified but including the date logic we need. One UDF just overloads the other.
{% macro udf_stocks_and_valuations_as_of_date() %}
CREATE OR REPLACE FUNCTION {{ target.schema }}.udf_stocks_and_valuations_as_of(as_of DATE) RETURNS TABLE (
company_id BIGINT,
vested_shares BIGINT,
total_common_shares BIGINT,
) AS
$$ WITH
stock_certs AS (
SELECT * FROM {{ ref('stocks') }}
),
cap_tables AS (
SELECT * FROM {{ ref('cap_tables') }}
),
companies AS (
SELECT * FROM {{ ref('companies') }}
),
stock_calcs AS (
SELECT
*,
CASE
WHEN vesting_period = 0
THEN num_shares
WHEN
EXTRACT(YEAR FROM AGE(as_of, issued_on)) * 12 + EXTRACT(MONTH FROM AGE(as_of, issued_on)) < 12
THEN 0
ELSE
(num_shares / (12 * vesting_period)) *
(EXTRACT(YEAR FROM AGE(as_of, issued_on)) * 12 + EXTRACT(MONTH FROM AGE(as_of, issued_on)))
END AS vested_shares
FROM stock_certs
),
final AS (
SELECT
companies.company_id,
SUM(CASE
WHEN stock_calcs.share_class = 'common'
THEN stock_calcs.vested_shares
END) AS vested_shares,
SUM(CASE
WHEN stock_calcs.share_class = 'common'
THEN stock_calcs.num_shares
END) AS total_common_shares
FROM stock_calcs
INNER JOIN cap_tables ON cap_tables.cap_table_id = stock_calcs.cap_table_id
INNER JOIN companies ON cap_tables.company_id = companies.company_id
GROUP BY companies.company_id
),
SELECT * FROM final
$$ LANGUAGE SQL;
{% endmacro %}
{% macro udf_stocks_and_valuations_as_of_month() %}
CREATE OR REPLACE FUNCTION {{ target.schema }}.udf_stocks_and_valuations_as_of(yyyy INT, mm INT) RETURNS TABLE (
company_id BIGINT,
vested_shares BIGINT,
total_common_shares BIGINT,
) AS $$
SELECT *
FROM {{ target.schema }}.udf_stocks_and_valuations_as_of((make_date(yyyy, mm, 1) + INTERVAL '1 month' - INTERVAL '1 day')::DATE)
$$ LANGUAGE SQL;
{% endmacro %}
I have an additional macro with a conditional based on one of the comments in the original post. I can’t just call the udf creation macros outright, because they they’ll run on a dbt seed
, when the referenced tables don’t exist.
-- idempotently create all Postgres UDFs
{% macro create_udfs() %}
-- Only run this if the outermost DBT command is "run"
-- (allows us to use on-run-end rather than manual run-operations on container spinup)
{% if flags.WHICH == "run" %}
{{ udf_stocks_and_valuations_as_of_date() }}
{{ udf_stocks_and_valuations_as_of_month() }}
{% else %}
select 1 as id
{% endif %}
{% endmacro %}
Then we have an on-run-end hook in dbt_project.yml
.
on-run-end:
- '{{ create_udfs() }}'
I think this bug can be recreated just by:
- Creating a macro that calls
{{ref()}}
- Calling that macro from another macro within a conditional
- Calling the outer macro as on-run-end (or, likely, on-run-start)