The problem I’m having
Hi there, I’m following the documented instructions in the dbt-labs/dbt-spark GitHub repository to set up a local Spark instance and connect to it with dbt, but I’m running into a connection issue.
What I’ve tried
When I run dbt debug with the suggested Docker setup and profile, dbt returns:
12:33:20 Connection:
12:33:20 host: 127.0.0.1
12:33:20 port: 10000
12:33:20 cluster: None
12:33:20 endpoint: None
12:33:20 schema: analytics
12:33:20 organization: 0
12:33:20 Registered adapter: spark=1.9.0
12:33:27 Spark adapter: Warning: No message, retrying due to 'retry_all' configuration set to true.
Retrying in 60 seconds (0 of 5)
12:34:27 Spark adapter: Warning: No message, retrying due to 'retry_all' configuration set to true.
Retrying in 60 seconds (1 of 5)
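Since the retry warnings carry no message, one thing worth ruling out is whether anything is listening on the port at all. Below is a minimal Python sketch, assuming the host and port shown in the output above (127.0.0.1:10000). The function name port_is_open is only illustrative and is not part of dbt or dbt-spark.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection completes the TCP handshake and raises OSError
        # (refused, timed out, unreachable) if it cannot.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port taken from the dbt debug output above.
    print(port_is_open("127.0.0.1", 10000))
```

If this reports that the port is open while dbt keeps retrying, the Docker port mapping is probably fine and the failure is more likely happening on the server side, which is what the container logs can help confirm.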
When inspecting the logs of the dbt-spark-dbt-spark3-thrift-1 container, I find this error:
25/01/30 13:09:33 ERROR SparkExecuteStatementOperation: Error executing query with 7f4f90c5-e4a8-4a00-bdc2-dda5ce0b3424, currentState RUNNING,
java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(Lorg/antlr/v4/runtime/ParserRuleContext;Lscala/Function0;)Ljava/lang/Object;
at org.apache.spark.sql.parser.HoodieSqlCommonAstBuilder.visitSingleStatement(HoodieSqlCommonAstBuilder.scala:37)
at org.apache.spark.sql.parser.HoodieSqlCommonAstBuilder.visitSingleStatement(HoodieSqlCommonAstBuilder.scala:31)
at org.apache.hudi.spark.sql.parser.HoodieSqlCommonParser$SingleStatementContext.accept(HoodieSqlCommonParser.java:107)
at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:18)
at org.apache.spark.sql.parser.HoodieCommonSqlParser.$anonfun$parsePlan$1(HoodieCommonSqlParser.scala:42)
at org.apache.spark.sql.parser.HoodieCommonSqlParser.parse(HoodieCommonSqlParser.scala:84)
at org.apache.spark.sql.parser.HoodieCommonSqlParser.parsePlan(HoodieCommonSqlParser.scala:41)
at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:620)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:620)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:291)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.runInternal(SparkExecuteStatementOperation.scala:216)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:277)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkOperation$$super$run(SparkExecuteStatementOperation.scala:43)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.$anonfun$run$1(SparkOperation.scala:45)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:79)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:63)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.run(SparkOperation.scala:45)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.run$(SparkOperation.scala:43)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(SparkExecuteStatementOperation.scala:43)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:484)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:460)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:71)
at org.apache.hive.service.cli.session.HiveSessionProxy.lambda$invoke$0(HiveSessionProxy.java:58)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:58)
at com.sun.proxy.$Proxy40.executeStatement(Unknown Source)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:280)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:456)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:52)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
25/01/30 13:09:33 INFO DAGScheduler: Asked to cancel job group 7f4f90c5-e4a8-4a00-bdc2-dda5ce0b3424
25/01/30 13:09:33 INFO SparkExecuteStatementOperation: Close statement with 7f4f90c5-e4a8-4a00-bdc2-dda5ce0b3424
25/01/30 13:09:33 WARN ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(Lorg/antlr/v4/runtime/ParserRuleContext;Lscala/Function0;)Ljava/lang/Object;
at org.apache.spark.sql.hive.thriftserver.HiveThriftServerErrors$.runningQueryError(HiveThriftServerErrors.scala:44)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:325)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.runInternal(SparkExecuteStatementOperation.scala:216)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:277)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkOperation$$super$run(SparkExecuteStatementOperation.scala:43)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.$anonfun$run$1(SparkOperation.scala:45)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:79)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:63)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.run(SparkOperation.scala:45)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.run$(SparkOperation.scala:43)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(SparkExecuteStatementOperation.scala:43)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:484)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:460)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:71)
at org.apache.hive.service.cli.session.HiveSessionProxy.lambda$invoke$0(HiveSessionProxy.java:58)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:58)
at com.sun.proxy.$Proxy40.executeStatement(Unknown Source)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:280)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:456)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:52)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(Lorg/antlr/v4/runtime/ParserRuleContext;Lscala/Function0;)Ljava/lang/Object;
at org.apache.spark.sql.parser.HoodieSqlCommonAstBuilder.visitSingleStatement(HoodieSqlCommonAstBuilder.scala:37)
at org.apache.spark.sql.parser.HoodieSqlCommonAstBuilder.visitSingleStatement(HoodieSqlCommonAstBuilder.scala:31)
at org.apache.hudi.spark.sql.parser.HoodieSqlCommonParser$SingleStatementContext.accept(HoodieSqlCommonParser.java:107)
at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:18)
at org.apache.spark.sql.parser.HoodieCommonSqlParser.$anonfun$parsePlan$1(HoodieCommonSqlParser.scala:42)
at org.apache.spark.sql.parser.HoodieCommonSqlParser.parse(HoodieCommonSqlParser.scala:84)
at org.apache.spark.sql.parser.HoodieCommonSqlParser.parsePlan(HoodieCommonSqlParser.scala:41)
at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:620)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:620)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:291)
... 35 more
25/01/30 13:09:33 ERROR TThreadPoolServer: Thrift error occurred during processing of message.
org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:43)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:52)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
25/01/30 13:09:33 INFO ThriftCLIService: Session disconnected without closing properly, close it now
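To make a trace like this easier to read, here is a small Python sketch that pulls out the exception named on the last "Caused by:" line together with the stack frame directly below it. The function name root_cause and the two regular expressions are my own illustration, not part of Spark, Hive, or dbt.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "Caused by: java.lang.NoSuchMethodError: <message>"
CAUSED_BY = re.compile(r"^\s*Caused by:\s*(?P<exc>[\w.$]+)(?::\s*(?P<msg>.*))?$")
# Matches a stack frame such as "at pkg.Class.method(File.scala:37)"
FRAME = re.compile(r"^\s*at\s+(?P<frame>\S+\(.*\))\s*$")


def root_cause(log_text: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (exception, message, first_frame) for the last 'Caused by:' line.

    Returns None if the text contains no 'Caused by:' line.
    """
    lines = log_text.splitlines()
    result = None
    for i, line in enumerate(lines):
        match = CAUSED_BY.match(line)
        if not match:
            continue
        # The line right after "Caused by:" is the frame that threw.
        following = lines[i + 1] if i + 1 < len(lines) else ""
        frame = FRAME.match(following)
        first_frame = frame.group("frame") if frame else None
        # Keep overwriting so the innermost (last) cause wins.
        result = (match.group("exc"), match.group("msg") or "", first_frame)
    return result
```

Applied to the container log above, this picks out java.lang.NoSuchMethodError for ParserUtils$.withOrigin, thrown from HoodieSqlCommonAstBuilder.visitSingleStatement. In other words, a Hudi class is calling a Spark parser method that does not exist with that signature at runtime, which usually indicates that a jar was compiled against a different version of a library than the one actually loaded.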
Local environment
For the docker-compose file, I’ve tried both the current main branch of dbt-labs/dbt-spark and the latest release (v1.9.0). To run dbt, I’m on Python 3.12.8, with dbt-core 1.9.1 and dbt-spark 1.9.0.
Googling
When Googling this error, the most similar issue I found is the Databricks Community thread "Solved: java.lang.NoSuchMethodError after upgrade to Datab..." (16005).