The problem I’m having
I have an external table that is sourced like this:
sources:
  - name: my_name
    database: my_database
    schema: my_schema
    tables:
      - name: my_table
        external:
          location: "gs://my-gcs-bucket/my-path/ds=*"
          partitions:
            - name: ds
              data_type: date
          options:
            format: parquet
            enable_list_inference: true
In each ds= folder, I store parquet files with names like export_1000_to_1130.parquet. It worked fine until I had to upload smaller files and name them like export_1000_to_1130_abcd.parquet. I use uuid=8 to generate the suffixes of my files.
It looks like my dbt source doesn’t load the new files and only loads the old ones. I’ve checked the new parquet files and my data is in there, but the rows (for example, their IDs) don’t show up in my table.
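As a quick sanity check on the naming itself, here is a minimal sketch that tests whether both file names fall under the location pattern. It assumes the single trailing * in the location behaves as a plain prefix match on the object URI, which is an assumption and not something confirmed by the dbt docs. The ds=2024-01-01 folder and the matches_location helper are hypothetical and only used for illustration.

```python
# Sketch only: assumes the single trailing "*" in the external table location
# acts as a plain prefix match on the object URI (an assumption, not confirmed
# by the dbt docs).

LOCATION = "gs://my-gcs-bucket/my-path/ds=*"


def matches_location(uri: str, location: str = LOCATION) -> bool:
    """Return True if `uri` falls under `location`, treating a trailing '*' as a prefix wildcard."""
    if location.endswith("*"):
        # Drop the wildcard and compare the remaining prefix.
        return uri.startswith(location[:-1])
    # No wildcard: only an exact match counts.
    return uri == location


# Hypothetical object URIs: the ds=2024-01-01 folder is made up for illustration.
old_file = "gs://my-gcs-bucket/my-path/ds=2024-01-01/export_1000_to_1130.parquet"
new_file = "gs://my-gcs-bucket/my-path/ds=2024-01-01/export_1000_to_1130_abcd.parquet"

for uri in (old_file, new_file):
    print(uri, matches_location(uri))
```

If both names match under that assumption, the extra suffix alone is probably not what excludes the new files, and the cause may lie elsewhere.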
I’ve checked dbt docs and couldn’t find any information about the format or naming convention of the files in my partition folders.
Any ideas?