BigQuery JS UDF (nanoid) returns same value many times only when run via dbt incremental MERGE

The problem I’m having

I’m using a BigQuery JavaScript UDF (NanoID) to generate unique IDs in dbt models. For ~2 years it worked fine, but recently (starting around Oct 2024) some dbt runs produce the same generated ID repeated across many rows (hundreds/thousands of duplicates). When I run the compiled SQL manually in the BigQuery UI, the IDs look unique the issue seems to happen specifically when dbt executes the model

The context of why I’m trying to do this

I maintain mapping tables (e.g., study_id_mapping, variable_id_mapping) where new entities (DOIs, variable names, etc.) need a stable unique ID. dbt incremental models identify “unmapped” values and insert new (id, value) pairs. Because these tables are large (tens of millions of rows), we generate IDs on insert using a UDF rather than doing it externally.

What I’ve already tried

  1. Updated the UDF to be explicitly NOT DETERMINISTIC.
  2. Updated all dbt models to pass an argument into the UDF (no more zero-arg calls).
  3. Increased NanoID length from 10 → 12 for new IDs.
  4. Confirmed that if I run the compiled SELECT by itself in BigQuery, it appears to generate unique IDs - but dbt’s execution sometimes produces duplicates.

Some example code or error messages

// UDF
CREATE OR REPLACE FUNCTION `<project-id>.udf.generate_nanoid`(value STRING)
RETURNS STRING
NOT DETERMINISTIC
LANGUAGE js
OPTIONS (library=["gs://bigquery-js-ids/id.js"])
AS r"""
  var result = nano.generateId();
  return result;
""";


//study_id_mapping
with
  studies as (select distinct doi from `...p_abstracts`),
  oa_studies as (select distinct doi from `...all_associations`),
  all_normalized_dois as (
    select udf.normalize_doi(doi) as doi from studies
    union distinct
    select udf.normalize_doi(doi) as doi from oa_studies
  ),
  unmapped as (
    select distinct doi
    from all_normalized_dois
    where doi not in (select doi from `common.study_id_mapping`)
  )
select udf.generate_nanoid(doi) as study_id, doi
from unmapped;



// What dbt actually runs (BigQuery job query)
merge into `common.study_id_mapping` as DBT_INTERNAL_DEST
using (
  ...same CTEs...
  select udf.generate_nanoid(doi) as study_id, doi
  from unmapped
) as DBT_INTERNAL_SOURCE
on (FALSE)
when not matched then insert (`study_id`, `doi`) values (`study_id`, `doi`);

What I’m trying to understand
Has anyone seen BigQuery JS UDFs (or NanoID in JS UDFs) behave differently under dbt incremental MERGE execution vs running the compiled SELECT manually? Any known dbt-bigquery adapter behavior (parallelism, connection/job settings, MERGE shape) that could cause the UDF result to be reused unexpectedly?

There are no error messages (except for my dbt test, that checks for uniqueness)