Pass EXTERNAL ACCESS INTEGRATION parameters to Snowflake via dbt

I’m attempting to connect to an SFTP site to upload and download a CSV file. If I run the code in a dbt Python model, I get a “Device is busy” error. If I run the same Python code directly in Snowflake, everything works. From what I found, Snowflake can’t reach outside its own network by default, so you have to create an EXTERNAL ACCESS INTEGRATION in Snowflake before any stored procedure can communicate with an SFTP site. I’m struggling to figure out how to pass that EXTERNAL ACCESS INTEGRATION to Snowflake through a dbt Python model. If that’s not possible, what are my other options? Maybe my assumption about the “Device is busy” error is wrong.
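For anyone landing here later: the integration itself has to be created inside Snowflake by a role with the right privileges before anything in dbt can reference it. A minimal sketch (the host, port, and object names are placeholders, not anything from the original post):

```sql
-- Allow egress to the SFTP host (placeholder host/port)
CREATE OR REPLACE NETWORK RULE sftp_network_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('sftp.example.com:22');

-- Bundle the network rule into an integration that procedures/models can use
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION sftp_access_integration
  ALLOWED_NETWORK_RULES = (sftp_network_rule)
  ENABLED = TRUE;
```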

Some example code:

import paramiko
import pandas as pd
import io

def model(dbt, session):
    dbt.config(
        schema='p_icarol',
        database='production',
        packages=["pandas==2.0.3", "paramiko==2.8.1"],
        materialized="table"
    )

    dfr = pd.DataFrame({
        'column1': [1, 2, 3],
        'column2': ['a', 'b', 'c']
    })

    # Convert DataFrame to CSV format in memory
    csv_buffer = io.StringIO()
    dfr.to_csv(csv_buffer, index=False)
    csv_buffer.seek(0)  # Rewind the buffer to the beginning of the file

    # SFTP connection information
    sftp_host = 'HOST'
    sftp_port = 22  # Default SFTP port
    user_name = 'USERNAME'
    pass_word = 'PASSWORD'
    remote_file_path = '/path/to/file/testing.csv'  # Remote file path on SFTP server

    # Initialize the SSH client and connect to the SFTP server
    ssh_client = paramiko.SSHClient()
    ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh_client.connect(hostname=sftp_host, port=sftp_port, username=user_name, password=pass_word)

    # Initialize SFTP client
    sftp = ssh_client.open_sftp()

    # Upload the CSV from the in-memory buffer
    with sftp.open(remote_file_path, 'w') as remote_file:
        remote_file.write(csv_buffer.getvalue())

    # Close connections
    sftp.close()
    ssh_client.close()

    # dbt Python models must return a DataFrame to materialize
    return dfr

Hey! This just merged into dbt-snowflake 1.8 — see the “Python models” page on the dbt Developer Hub (search for “External access integrations and secrets” on that page).
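To make that concrete: as of dbt-snowflake 1.8 you can reference the integration (and a Snowflake secret for the password) directly from `dbt.config` in the Python model. A sketch of the relevant part — `sftp_access_integration` and `sftp_password_secret` are placeholder names that would have to match objects an admin already created in Snowflake:

```python
def model(dbt, session):
    dbt.config(
        materialized="table",
        packages=["pandas==2.0.3", "paramiko==2.8.1"],
        # Name(s) of EXTERNAL ACCESS INTEGRATION objects in Snowflake (placeholder)
        external_access_integrations=["sftp_access_integration"],
        # Maps a local name to a Snowflake SECRET object (placeholder)
        secrets={"sftp_password": "sftp_password_secret"},
    )

    # Inside Snowflake, an attached secret is read via the _snowflake module
    # instead of hard-coding credentials in the model:
    import _snowflake
    pass_word = _snowflake.get_generic_secret_string("sftp_password")

    # ...open the paramiko SSH/SFTP connection as in the question,
    # using pass_word, then return a DataFrame...
```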