How can we access AWS S3 buckets from DBT

shylashree.dr · February 14, 2024, 6:23pm

Hello Everyone!!
I’m working on a use case where I need to directly interact with S3 buckets to read files, such as CSVs, Parquet files, or other data formats stored in S3. Are there any best practices or recommended approaches for integrating S3 data into dbt models and transformations?

Any insights, experiences, or recommendations would be greatly appreciated! Thank you in advance for sharing your knowledge with the community.

a_slack_user · February 14, 2024, 6:34pm

DuckDB supports working directly with S3 and can work with dbt as well:
https://duckdb.org/docs/guides/import/s3_import
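For anyone who wants something concrete, here is a minimal sketch of the kind of SQL DuckDB runs to read S3 objects. It assumes DuckDB's `httpfs` extension and its `read_parquet` / `read_csv_auto` table functions; the helper names (`s3_select_sql`, `session_setup_sql`), the bucket, and the region are made up for illustration, and credentials are deliberately left out.

```python
from pathlib import PurePosixPath

# DuckDB table functions for the file formats mentioned in the question.
READERS = {
    ".parquet": "read_parquet",
    ".csv": "read_csv_auto",
}


def s3_select_sql(uri: str) -> str:
    """Build a SELECT that reads a single S3 object with DuckDB (hypothetical helper)."""
    if not uri.startswith("s3://"):
        raise ValueError(f"expected an s3:// URI, got {uri!r}")
    suffix = PurePosixPath(uri).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"no reader configured for {suffix!r}")
    escaped = uri.replace("'", "''")  # escape single quotes for the SQL literal
    return f"SELECT * FROM {READERS[suffix]}('{escaped}')"


def session_setup_sql(region: str) -> list[str]:
    """Statements that enable S3 access in a DuckDB session (hypothetical helper)."""
    return ["INSTALL httpfs", "LOAD httpfs", f"SET s3_region = '{region}'"]


if __name__ == "__main__":
    # Placeholder bucket and region; replace with your own.
    for statement in session_setup_sql("us-east-1"):
        print(statement + ";")
    print(s3_select_sql("s3://my-bucket/data/orders.parquet") + ";")

    # To actually run it you would need the `duckdb` package and AWS credentials:
    #   import duckdb
    #   con = duckdb.connect()
    #   for statement in session_setup_sql("us-east-1"):
    #       con.execute(statement)
    #   rows = con.execute(s3_select_sql("s3://my-bucket/data/orders.parquet")).fetchall()
```

In a dbt project that uses the dbt-duckdb adapter, a generated `SELECT` like this could plausibly serve as the body of a model, but check the adapter's documentation for how it expects extensions and S3 settings to be configured.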

Note: @Gio originally posted this reply in Slack. It might not have transferred perfectly.

prajaktachitale · February 29, 2024, 4:15pm

Can you please provide an example of using DuckDB with AWS and dbt?
Also, is it supported by both dbt Core and dbt Cloud?

svdimchenko · March 6, 2024, 6:09pm

You can use the dbt-athena adapter (GitHub: dbt-athena/dbt-athena) for that; it currently supports both Hive and Iceberg tables.

Another option is the dbt-glue adapter (GitHub: aws-samples/dbt-glue). It may be costly, but it performs better on huge data volumes.

Ultimately it depends on your current infrastructure and the tooling you have available, so other options may also be possible.
