I am aware of the limitations of Python models in dbt and of their pros and cons compared with SQL models. However, the main draw of dbt for our team is that our core DE team is Python-first, while a number of our consultants are SQL-first. The ability for these two groups of developers to collaborate is what has us exploring our use case for dbt Cloud with their sales team.
Since the release of Python models, however, I have not been able to find any discussion of progress on some of their key limitations. I understand that dbt’s team needs to make some internal decisions about design patterns, but I would like to know if anyone here is aware of dbt’s plans for Python model support going forward, namely:
When code reuse will be supported. The main advantage of Python in DE is modularity. While dbt advances modularity at a higher level by encapsulating data entities as models, the inability to modularize processes such as column-level transformations and column renaming across models is quite a big hole for our use case.
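To make this point concrete, here is a minimal sketch of the kind of helper we would like to define once and import into several Python models. The function name `rename_columns` and its arguments are hypothetical; the only library call it assumes is Snowpark's `DataFrame.with_column_renamed`.

```python
from typing import Any, Mapping


def rename_columns(df: Any, mapping: Mapping[str, str]) -> Any:
    """Rename columns one at a time and return the resulting DataFrame.

    `df` is assumed to be a Snowpark DataFrame (or any object exposing
    `with_column_renamed(existing, new)`); `mapping` is old name -> new name.
    """
    for old_name, new_name in mapping.items():
        # Each call returns a new DataFrame, so rebind and continue.
        df = df.with_column_renamed(old_name, new_name)
    return df
```

Today, as far as I can tell, a function like this has to be copied into every model file that needs it.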
When the restrictions on materializations may be lifted, if at all. From my rather cursory understanding of dbt, certain functionality in the codebase remains specific to a DW provider. For example, I believe the dbt_snowflake adapter supports “dynamic_table” as a materialization for SQL-based models in Snowflake. If this is the case, why can materializations not be expanded for Snowflake Python models, given the existence of the snowflake.snowpark.DataFrame.create_or_replace_view method (to support views) and the snowflake.snowpark.DataFrame.create_or_replace_dynamic_table method (to support dynamic tables)? I understand the complexity of supporting ephemeral materializations for cross-language models, but I do not understand why a Snowflake Python model cannot be materialized as a view.
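As a rough illustration of what I have in mind, the sketch below maps a materialization name onto the Snowpark methods mentioned above. This is not how the adapter is implemented; the `materialize` function, its `options` argument, and the `warehouse`/`lag` keys are assumptions made purely for the example.

```python
from typing import Any


def materialize(df: Any, name: str, materialization: str, **options: str) -> None:
    """Hypothetical dispatch from a materialization name to a Snowpark call.

    `df` is assumed to be a snowflake.snowpark.DataFrame.
    """
    if materialization == "view":
        df.create_or_replace_view(name)
    elif materialization == "dynamic_table":
        # Assumption: warehouse and lag would come from the model's config.
        df.create_or_replace_dynamic_table(
            name, warehouse=options["warehouse"], lag=options["lag"]
        )
    elif materialization == "table":
        df.write.mode("overwrite").save_as_table(name)
    else:
        raise ValueError(f"unsupported materialization: {materialization}")
```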
Type-hinting and IDE experience. This one is a little less critical, but it makes development more complicated than it needs to be. Why does dbt not offer a package to import from so that we can type-hint the dbt argument and let the IDE offer auto-complete suggestions? Our workaround is to keep a module in our dbt project from which we import a symbolic class that lets us see the methods available on the DataFrame, but we have to comment out the import and remove the type hint to run the models, which feels a little hacky. I know we could just define a symbolic class in each model file, but that again introduces the kind of redundancy that Python is supposed to eliminate.
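For reference, the per-file version of that symbolic class looks roughly like the sketch below. The class name `DbtLike` and the method signatures are my own assumptions based on how we call `dbt.ref`, `dbt.source`, and `dbt.config`; nothing here is shipped by dbt.

```python
from typing import Any, Protocol


class DbtLike(Protocol):
    """Hypothetical stub of the `dbt` argument, used only for IDE hints."""

    def ref(self, *args: str) -> Any: ...

    def source(self, source_name: str, table_name: str) -> Any: ...

    def config(self, **kwargs: Any) -> None: ...


def model(dbt: DbtLike, session: Any) -> Any:
    # With the annotation in place, the IDE can suggest ref/source/config.
    dbt.config(materialized="table")
    return dbt.ref("upstream_model")  # "upstream_model" is a placeholder name
```

Repeating this class at the top of every model file is exactly the duplication we would like to avoid.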