The problem I’m having
I am using a dbt Snowpark Python model. I am loading a column from a Snowflake table that contains WKT strings in EPSG:27700, and I am attempting to convert these strings to a valid geometry column in EPSG:4326 using GeoPandas and Shapely.
I am getting the error message:
pyproj.exceptions.ProjError: x, y, z, and time must be same size
in function INT_INTERPRET_GEOMETRY__DBT_SP with handler main
I am wondering whether anyone else has attempted to convert geometries in this way yet?
What I’ve already tried
I have tested this in a ‘regular’ Python script and it runs without errors. Both the conversion and the data work there, so I can confirm the issue is not related to corrupted or badly formatted WKT. The code should be fine too, at least from a GeoDataFrame/Shapely perspective.
In the ‘regular’ script I use shapely==2.0.1, geopandas==0.12.2 and pyproj==3.4.1, which appear to be the same versions that Snowpark uses according to the "Snowflake Snowpark for Python" page.
I checked the Stack Overflow question "ProjError with GeoPandas (x, y, z and time must be the same size)", which seems to suggest the issue is with the pyproj package.
It seems the highest version of pyproj supported in the Snowpark Anaconda channel is 3.4.1, so an update to 3.5.0 is not possible.
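Since I have only compared the versions against the published list, one way to confirm what is actually installed at runtime would be something like the sketch below. It only uses the standard library; the helper name `installed_versions` is made up for illustration, and adding numpy to the list is my own assumption (pandas depends on it), not something the error message points to.

```python
from importlib import metadata


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package list taken from dbt.config(packages=...), plus numpy as an assumption
    names = ["geopandas", "shapely", "pandas", "pyproj", "numpy"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version}")
```

Running the same function in the local script and inside the model handler should show whether the two environments really match.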
Some example code or error messages
import geopandas as gpd
from shapely.wkt import loads


def model(dbt, session):
    dbt.config(
        materialized="table",
        packages=["geopandas", "shapely", "pandas", "pyproj"],
    )
    snowpark_df = dbt.ref("int_filter_permit")
    pandas_df = snowpark_df.toPandas()
    # Parse the WKT strings into shapely geometries
    pandas_df["geometry"] = pandas_df["geometry"].apply(loads)
    pandas_df = gpd.GeoDataFrame({"geometry": pandas_df["geometry"]}, crs="epsg:27700")
    pandas_df["geometry"] = pandas_df["geometry"].to_crs("epsg:4326")  # <--- where the error happens
    output_df = session.create_dataframe(pandas_df)  # convert back to a Snowpark DataFrame
    return output_df
Some of the geometry I am trying to convert:
POINT(430321 266217)
POINT(448377.903762817 128067.056248932)
POINT(538406.477 169287.031)
LINESTRING(412103.414583034 291618.949591734,412115.938221884 291605.191230884)
POINT(518642.39 260744.78)
POINT(505611 206201)
LINESTRING(255009.885759349 134187.792095766,254937.173259349 134168.104595766)
POINT(524349.31 169787.73)
LINESTRING(527351.487518646 184044.237073459,527346.778326752 184041.806523838)
LINESTRING(563540.9 263540.8,563542.7 263538.9,563544.5 263536.7,563546.5 263533.2,563549.9 263527.6,563553.1 263519.9,563558.9 263506.3,563560.6 263499.7,563562.8 263488.6,563564.1 263479.1,563564 263471.3,563564.2 263450.7,563562.4 263442.3,563560.9 263433,563560 263429.1,563556.4 263419.8,563552.6 263409.2,563550.2 263401.5,563547.9 263397.6,563546.1 263395.3,563535.6 263384.6,563527.8 263377.9,563503.4 263357.7)
POINT(426602.55 535624.74)
LINESTRING(484392.823607431 197052.027106416,484390.001378218 197049.381265773)
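Because the error complains about x, y, z and time sizes, a rough way to double-check that every geometry is strictly 2D is sketched below. This is not a real WKT parser and it does not explain the error; it only counts the numbers in each coordinate tuple. The function name `coordinate_dimensions` is hypothetical.

```python
def coordinate_dimensions(wkt):
    """Return the set of coordinate-tuple lengths found in a simple WKT string."""
    start = wkt.find("(")
    if start == -1:
        # e.g. "POINT EMPTY" has no coordinates at all
        return set()
    # Drop the geometry type, then treat parentheses as plain separators
    body = wkt[start:].replace("(", " ").replace(")", " ")
    dims = set()
    for chunk in body.split(","):
        values = chunk.split()
        if values:
            dims.add(len(values))
    return dims


samples = [
    "POINT(430321 266217)",
    "POINT(448377.903762817 128067.056248932)",
    "LINESTRING(412103.414583034 291618.949591734,412115.938221884 291605.191230884)",
]

# Collect any geometry whose tuples are not all (x, y)
not_2d = [wkt for wkt in samples if coordinate_dimensions(wkt) != {2}]
print(not_2d)
```

For the sample rows above the list comes back empty, which matches what I see in the local script: the inputs are plain 2D points and linestrings.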
Does anyone know why I would be getting this error message?