I run spark-submit with a fat jar inside a Docker container. The Docker image is here: docker-spark-submit

My standalone Spark cluster runs on 3 virtual machines - one master and two workers. From an executor log on a worker machine, I see that the executor gets a `--driver-url` that is actually the address of the container with the driver program, not the host machine where the container is running. This IP is not accessible from the worker machine, therefore the worker is not able to communicate with the driver program.

As I see from the source code of StandaloneSchedulerBackend, it builds `driverUrl` using the `spark.driver.host` setting:

```scala
val driverUrl = RpcEndpointAddress(
  sc.conf.get("spark.driver.host"),
  sc.conf.get("spark.driver.port").toInt,
  CoarseGrainedSchedulerBackend.ENDPOINT_NAME).toString
```

It does not take into account the SPARK_PUBLIC_DNS environment variable - is this correct?

In the container, I cannot set `spark.driver.host` to anything else except the container "internal" IP address (172.17.0.2 in this example). When trying to set it to the IP address of the host machine, I get errors like this:

```
WARN Utils: Service 'sparkDriver' could not bind on port 5001.
```

I also tried other settings with the IP address of the host machine, but got the same errors.

So, how can I configure Spark to communicate with the driver program using the host machine IP address rather than the Docker container address?

UPD: Stack trace from the executor:

```
ERROR RpcOutboxMessage: Ask timeout before connecting successfully
Exception in thread "main" ...
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
  at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
  at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
  at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:284)
  at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in 120 seconds
  at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
  at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
  at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
  at scala.runtime.AbstractPartialFunction.applyOrElse(AbstractPartialFunction.scala:36)
  ...
  at org.apache.spark.rpc.netty.NettyRpcEnv$$anon$1.run(NettyRpcEnv.scala:239)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Caused by: java.util.concurrent.TimeoutException: Cannot receive any reply in 120 seconds
```

Answer:

This timeout ("Cannot receive any reply in 120 seconds") is controlled by Spark's RPC ask timeout, `spark.rpc.askTimeout`, which falls back to `spark.network.timeout` (120 s by default). I noticed the other answers were using Spark Standalone (on VMs, as mentioned by the OP, or on 127.0.0.1, as in another answer).
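A common way to make executors reach a driver running inside a container is to separate the address the driver advertises from the address it binds: set `spark.driver.host` to the host machine's IP, `spark.driver.bindAddress` (available since Spark 2.1) to an address that exists inside the container, pin the driver and block-manager ports, and publish those ports with `docker run -p`. A minimal sketch, in which the host IP 10.0.0.5, the master address, the image name, the ports, and the application class/jar are all placeholders for illustration:

```shell
# Ports 5001/5002 are arbitrary, but the -p mappings must match the Spark conf.
docker run \
  -p 5001:5001 \
  -p 5002:5002 \
  docker-spark-submit \
  spark-submit \
    --master spark://spark-master:7077 \
    --conf spark.driver.host=10.0.0.5 \
    --conf spark.driver.bindAddress=0.0.0.0 \
    --conf spark.driver.port=5001 \
    --conf spark.blockManager.port=5002 \
    --class com.example.Main \
    my-app.jar
```

With this split, the driver binds on a container-local address but advertises the host's IP, so the `--driver-url` the workers see points at a reachable address. On Linux hosts, `docker run --network host` is a simpler alternative when host networking is acceptable.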