Transform complex json using dbt

I’m currently working on a dbt transformation for JSON data stored in a Snowflake table, and I’m facing some challenges. The JSON structure involves nested arrays, and I need to extract specific information to create a structured table. Here’s a simplified example of the JSON structure:

{
    "eventid": 123,
    "consumerID": 123,
    "metadata": {
        "sessionid": "555",
        "timestamp": "2024-02-22T09:18:05.848",
        "vtnr": {
            "vtnr_dat": "606",
        },
    },
    "data": [{
            "vehicledata": {
                "vehicleid": 125,
                "vehicletype": "B",
            },
            "tariff": [{
                    "tariffid": 1,
                    "pricekh": 12.43,
					"pricetk": null,
                    "year": 2002,
                    "kh": true,
                    "tk": null,
                }, {
                    "tarrifid": 1,
                    "pricekh": null,
                    "pricetk": 20.28,
					"year": 2002,
                    "kh": null,
                    "tk": null,
                }
            ]
        }, {
            "vehicledata": {
                "vehicleid": 126,
                "vehicletype": "B",
            },
            "tariff": [{
                    "tariffid": 5,
                    "pricekh": 10.00,
                    "pricetk": null,
					"year": 2022,
                    "kh": true,
                    "tk": null,
					"vk": null
                }, {
                    "tariffid": 5,
                    "pricekh": null,
                    "pricetk": 59.28,
					"pricevk": null,
                    "kh": null,
                    "tk": true,
					"vk": null
                }
            ]
        }
    ]
}

At the end I should get the following table and results:

sessionid	timestamp				vtnr_dat	vehicleid	vehicletype	tariffid	year	kh		tk		vk		pricekh	pricetk	pricevk
555			2024-02-22T09:18:05.848	606			125			B			1			2002	true	true	null	12.43	20.28	null
555			2024-02-22T09:18:05.848	606			126			B			5			2022	true	true	null	10.00	59.00	null

I tried to exctract metadata and information about vehicles with lateral flatten but I couldn’t find the solution how to combine that with more objects for tariff in a single row of data.

Any idea how to solve this transformation?

I think you have two options. If the inner array, the tarrifs, always contains exactly two items, one for tk and one for kh, you might not need to flatten it because you can just dip in and get the correct values.

Or alternatively, you could flatten the outer array first, and then the second array, and then you would need to re-aggregate the table to have one per vehicle instead of one row per vehicle-tariff

Note: @Mike Stanley originally posted this reply in Slack. It might not have transferred perfectly.

1 Like