Multi-node cluster running Databricks Runtime 12.2 LTS ML (Apache Spark 3.3.2, Scala 2.12)
When we run our models with dbt run, dbt seems to parse every table in the schema before the actual queries start. The model itself takes only 2 seconds to run, but this parsing step takes over 15 minutes. Our schema has more than 3,500 tables, and for governance reasons we are not able to create our own dedicated schema, as the dbt documentation recommends.
We have tried some cache-related CLI options, such as --no-populate-cache, but they seem to have no impact.
What configuration changes can we make, other than creating our own separate schema, to shorten this 15-minute schema parsing step?
For context, I have been following this GitHub issue for some time, waiting for a solution: