schema.yml reading a view as a source

For the yml file, are sources view type that can be used as a source? Our source systems have logical views, and we don’t want to duplicate the complex logic in dbt schema.yml to avoid sync issues. It seems that yml doesn’t have a views type option.

Models are materializing as a view, so it compiles a nested view reading a view or CTA reading a view. Hence, it produces an unnecessary nested view redundancy. Is there another better practice?

Thanks,
Laura B

Sample schema.yml

sources:
  - name: table label
    schema: schema name
    tables:
      - name: database table name

Welcome @lbert!

Does this mean you are building your dbt project inside of the same production database where the data is created, as opposed to using an ETL tool to move the data into a separate analytics warehouse?

Can you give an example of what you mean here? There won’t be any complex transformation logic in a yaml file, and there is no yaml in a view definition so I don’t understand what logic you’re worried about duplicating.

In the dbt paradigm, it’s extremely common to have a stack of views reading from one another, as this enables you to break up your modelling logic into individual steps. Taken to an extreme this results in performance issues, so we recommend limiting the number of views chained together without a table along the way, but otherwise this isn’t an antipattern or something you need to worry too much about.

Thank you for your response. There is a table_type option. The data provider owns the source view, and it contains business logic. DBT project can read a view vs creating a model that uses view and then creates a view of view.

scheme.yml
sources:

  • name: bi
    schema: bi
    tables:
    • name: fact_email_funnel
      identifier: fact_email_funnel
      table_type: table
      description: email campaign engagement metrics
    • name: fact_clickstream
      identifier: fact_clickstream
      table_type: view
      description: web clickstream events

Can you link me to the dbt documentation where you found this setting? I don’t see it in the source properties documentation and have never heard of it.

Regardless, you can query sources that are views if you want to. Our modelling best practice strongly recommends using a staging layer to ensure that you only directly access sources once (and then build other models on top of the staging layer), but you can query them directly if you want to remove a layer of abstraction.

If you choose to go down that path, keep in mind that dbt projects pretty much always have views stacked on top of each other. It’s not clear to me why it’s important to you that the source not be a view on top of a view, when your downstream models are very very likely to be views on top of views anyway.

I understand that one of DBT mainframe work is stacked views. In the past, we have seen performance issue with nested views. Sometimes phyical tables are the best options.

The great benefit of views is having business logic can be change quickly with litle impact and quick turnaround for the stackholders and little backfill efforts.

1 Like