As a senior Oracle database developer with 20+ years of experience working on complex business systems built on Oracle and batch jobs, I kept finding that most batch LT/TL jobs for inbound files would fail with a hard SQL DML exception due to errors in the data file.
Meanwhile I jumped into cloud data engineering and have been very curious how the zillions of open-source tools handle a database SQL exception.
I understand the basics of the dbt macro/template idea and find it interesting, but I wonder what the recommended way is to load a file that contains both good and bad records.
The context of why I’m trying to do this
In a batch job loading a file with 1 million rows, imagine one record causes an exception in a DML statement (e.g. INSERT/SELECT). The whole statement fails, so none of the good rows get loaded either.
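To make the failure mode concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the warehouse. The table names (staging, target) and the NOT NULL rule are hypothetical and only for illustration; other databases raise different errors, but the all-or-nothing behaviour of a single statement is the point.

```python
import sqlite3


def load_all_or_nothing(rows):
    """Try to load every staged row with a single INSERT ... SELECT."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables: staging accepts anything, target rejects NULL amounts.
    conn.execute("CREATE TABLE staging (id INTEGER, amount INTEGER)")
    conn.execute("CREATE TABLE target (id INTEGER, amount INTEGER NOT NULL)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)

    error = None
    try:
        # One bad row makes the whole statement fail; rows inserted
        # earlier by the same statement are backed out.
        conn.execute("INSERT INTO target SELECT id, amount FROM staging")
    except sqlite3.IntegrityError as exc:
        error = str(exc)

    loaded = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
    conn.close()
    return loaded, error


# Row 3 has a missing amount, so nothing at all ends up in target.
rows = [(1, 10), (2, 20), (3, None), (4, 40)]
loaded, error = load_all_or_nothing(rows)
print(loaded, error)
```

Here loaded comes back as 0 even though three of the four rows were fine, which is the situation described above at a much smaller scale.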
What I’ve already tried
Oracle supports this amazingly well: you supply a second target table, and rows that cause exceptions get inserted there instead of failing the statement. You just add a LOG ERRORS INTO <exceptions_table> clause (with a REJECT LIMIT) to an Oracle SQL statement, e.g. INSERT/SELECT … LOG ERRORS INTO myExceptionsTable REJECT LIMIT UNLIMITED. But most databases don't support this, and given that it is one of the main causes of broken pipelines when loading files, especially from external sources, I wonder what the dbt pattern for this is.
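Where the database has no such clause, one way to approximate the exceptions-table idea is to validate up front and route rows with set-based SQL: rows that pass the rule go to the target, the rest go to a second table with a reason. The sketch below again uses Python's sqlite3 as a stand-in; the table names (staging, target, load_exceptions) and the single validation rule are assumptions for illustration. This is not Oracle's error logging and not an official dbt feature. It only catches the errors you thought to check for, whereas Oracle logs whatever the DML actually raised.

```python
import sqlite3

# Hypothetical validation rule, written once and reused for both branches.
IS_VALID = "amount IS NOT NULL"


def load_with_exceptions_table(rows):
    """Route valid rows to target and invalid rows to load_exceptions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE staging (id INTEGER, amount INTEGER);
        CREATE TABLE target (id INTEGER, amount INTEGER NOT NULL);
        CREATE TABLE load_exceptions (id INTEGER, amount INTEGER, reason TEXT);
        """
    )
    conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)

    # Bad rows are recorded with a reason instead of raising an error.
    conn.execute(
        "INSERT INTO load_exceptions "
        "SELECT id, amount, 'amount is missing' FROM staging "
        f"WHERE NOT ({IS_VALID})"
    )
    # Good rows load normally because the bad ones were filtered out first.
    conn.execute(
        f"INSERT INTO target SELECT id, amount FROM staging WHERE {IS_VALID}"
    )

    good = conn.execute("SELECT id, amount FROM target ORDER BY id").fetchall()
    bad = conn.execute(
        "SELECT id, reason FROM load_exceptions ORDER BY id"
    ).fetchall()
    conn.close()
    return good, bad


rows = [(1, 10), (2, 20), (3, None), (4, 40)]
good, bad = load_with_exceptions_table(rows)
print(good)
print(bad)
```

In dbt terms the two INSERT ... SELECT statements would roughly correspond to two models selecting from the same staging source with opposite filters, but that mapping is my assumption rather than something the dbt docs prescribe.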
Yeah - I have a similar recent post in “Can dbt do this…?” I cannot find where there is exception handling in the dbt model architecture. When I write Snowflake stored procedures, I can do anything.
Question: How to handle the problem of sql exception in a dml statement using dbt
Answer from ChatGPT: dbt (Data Build Tool) is a popular open-source tool for building, testing, and deploying data pipelines. Here are some common ways to handle exceptions in a dbt data pipeline:
Use the on_error Configuration: The on_error configuration in a dbt model allows you to specify what action should be taken in the event of an error during the execution of a model. The available options are fail, warn, and ignore. For example, if you set on_error: fail, dbt will stop executing the pipeline if an error is encountered during the execution of a model.
Write Custom Macros: Macros in dbt allow you to encapsulate common logic and reuse it across multiple models. You can write custom macros to handle specific exceptions in your pipeline and ensure that the pipeline continues to run smoothly.
Use the {% exception %} Template Tag: The {% exception %} template tag allows you to handle exceptions within your dbt templates. For example, you can use the {% exception %} tag to catch any exceptions that occur during the execution of a template and handle them appropriately.
Logging and Monitoring: dbt has built-in logging capabilities, which allow you to capture any error messages that occur during the execution of a pipeline. You can also use third-party tools to monitor your dbt pipelines and receive notifications in the event of an error.
Test Your Models: Writing comprehensive tests for your dbt models can help you catch and resolve exceptions before they reach production. dbt includes built-in support for testing and allows you to write tests using SQL or Python.
By following these best practices, you can effectively handle exceptions in your dbt data pipeline and ensure that your pipeline continues to run smoothly.