How to configure allowing new fields when running a PySpark model on GCP Dataproc Serverless

The problem I’m having

After updating my Python model to add more fields, the PySpark job warns that the number of fields is mismatched:
WARN BigQueryDataSourceWriterInsertableRelation: unexpected issue trying to save [col1: string, col2: timestamp … 12 more fields]
com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryException: Inserted row has wrong column count; Has 14, expected 8 at [4:30]
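To make the mismatch concrete, here is a toy illustration (column names are placeholders, not my real schema): the destination BigQuery table still has the original 8 columns, while the updated model now produces 14.

```python
# Toy illustration of the "Has 14, expected 8" error: the table schema was
# never widened, so the 6 new columns have nowhere to go.
table_columns = ["col1", "col2"] + [f"col{i}" for i in range(3, 9)]    # 8 existing columns
model_columns = table_columns + [f"new_col{i}" for i in range(1, 7)]   # 14 after the update
missing = [c for c in model_columns if c not in table_columns]

print(len(model_columns), len(table_columns))  # 14 vs 8, as in the error
print(missing)                                 # the 6 columns the table lacks
```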

The context of why I’m trying to do this

We have a Python model that writes to a BigQuery table.
The PySpark job is submitted to Dataproc Serverless.
The problem occurs when we update the model to add new fields.
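For reference, the model has roughly this shape (a trimmed sketch: the upstream relation name and column changes are placeholders; `submission_method="serverless"` is how we route the job to Dataproc Serverless):

```python
# Hypothetical sketch of the dbt Python model; dbt materializes the
# returned DataFrame into the BigQuery table on our behalf.
def model(dbt, session):
    dbt.config(materialized="table", submission_method="serverless")
    df = dbt.ref("source_model")   # upstream relation (placeholder name)
    # ... transformations that now add 6 columns, taking the schema 8 -> 14 ...
    return df                      # written to BigQuery by dbt
```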

What I’ve already tried

  • Adding the allowFieldAddition property in profiles.yml:
runtime_config:
  properties: 
    allowFieldAddition: 'true'
  • Setting the Spark config in the Python model:
    global spark
    spark.conf.set("temporaryGcsBucket","temp_bucket")
    spark.conf.set("allowFieldAddition","true")

Some example code or error messages

Caused by: com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
GET https://www.googleapis.com/bigquery/v2/projects/*******/queries/*******************************?location=**************&maxResults=0&prettyPrint=false
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "location" : "q",
    "locationType" : "parameter",
    "message" : "Inserted row has wrong column count; Has 14, expected 8 at [4:30]",
    "reason" : "invalidQuery"
  } ],
  "message" : "Inserted row has wrong column count; Has 14, expected 8 at [4:30]",
  "status" : "INVALID_ARGUMENT"
}
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:439)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:525)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:466)
	at com.google.cloud.spark.bigquery.repackaged.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:576)
	at com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.getQueryResults(HttpBigQueryRpc.java:692)
	... 60 more
23/08/08 05:08:26 WARN BigQueryDirectDataSourceWriterContext: BigQuery Data Source writer c0f75ced-4543-4722-b974-0be9bceecc4a aborted