
java - NullPointerException only occurs when running the jar file - Scala Spark

I have a big data analytics project written in Scala on Spark.

When I run the project with sbt run, it works fine and gives me the results I want. After that, I built a jar file with sbt assembly and ran it with java -jar my.jar, but the process stopped and threw a NullPointerException.
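For context, my build is roughly the following (a simplified sketch; the project name, versions, and plugin version are placeholders, not my exact files):

// build.sbt (sketch; names and versions are placeholders)
name := "my-spark-app"
scalaVersion := "2.11.12"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.9")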

Can anyone explain why this happens?

I have attached the stack trace for reference.

2019-06-15 18:49:56 DEBUG BlockManager:58 - Getting local block broadcast_0
2019-06-15 18:49:56 DEBUG BlockManager:58 - Level for block broadcast_0 is StorageLevel(disk, memory, deserialized, 1 replicas)
2019-06-15 18:49:57 INFO CodecPool:179 - Got brand-new decompressor [.gz]
2019-06-15 18:49:57 DEBUG TaskMemoryManager:427 - unreleased 8.0 MB memory from org.apache.spark.sql.catalyst.expressions.VariableLengthRowBasedKeyValueBatch@43d7b5b2
2019-06-15 18:49:57 DEBUG TaskMemoryManager:427 - unreleased 256.0 KB memory from org.apache.spark.unsafe.map.BytesToBytesMap@6aee5557
2019-06-15 18:49:57 DEBUG TaskMemoryManager:434 - unreleased page: org.apache.spark.unsafe.memory.MemoryBlock@3517be49 in task 0
2019-06-15 18:49:57 DEBUG TaskMemoryManager:434 - unreleased page: org.apache.spark.unsafe.memory.MemoryBlock@295da0ba in task 0
2019-06-15 18:49:57 ERROR Executor:91 - Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NullPointerException
at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:153)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:102)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:46)
at org.apache.spark.sql.execution.datasources.text.TextFileFormat$$anonfun$readToUnsafeMem$1.apply(TextFileFormat.scala:127)
at org.apache.spark.sql.execution.datasources.text.TextFileFormat$$anonfun$readToUnsafeMem$1.apply(TextFileFormat.scala:124)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:148)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:132)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:128)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:182)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:109)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.agg_doAggregateWithKeys_0$(generated.java:568)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(generated.java:587)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-06-15 18:49:57 DEBUG TaskSchedulerImpl:58 - parentName: , name: TaskSet_0.0, runningTasks: 0
2019-06-15 18:49:57 WARN TaskSetManager:66 - Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NullPointerException
at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:153)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:102)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:46)
at org.apache.spark.sql.execution.datasources.text.TextFileFormat$$anonfun$readToUnsafeMem$1.apply(TextFileFormat.scala:127)
at org.apache.spark.sql.execution.datasources.text.TextFileFormat$$anonfun$readToUnsafeMem$1.apply(TextFileFormat.scala:124)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:148)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:132)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:128)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:182)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:109)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.agg_doAggregateWithKeys_0$(generated.java:568)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(generated.java:587)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Best Answer

Spark applications should be launched with spark-submit. See here.

So what you should do is package your application as a jar and run that jar in a Spark context using spark-submit.
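For example, something along these lines (a sketch; the main class and jar path are placeholders for your own):

spark-submit \
  --class com.example.Main \
  --master local[*] \
  target/scala-2.11/my.jar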

sbt either detects your Spark application (depending on your sbt settings) or simply runs your main method as a plain Scala application (again, depending on your settings).
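For reference, a minimal Spark entry point looks roughly like the sketch below (names are placeholders). When launched via spark-submit, the master URL and other settings come from the launcher rather than being hardcoded, which is one reason behavior can differ between sbt run and a standalone jar:

import org.apache.spark.sql.SparkSession

object Main {
  def main(args: Array[String]): Unit = {
    // With spark-submit, the master and app name can be supplied on the
    // command line; getOrCreate() picks up that configuration.
    val spark = SparkSession.builder()
      .appName("my-spark-app")
      .getOrCreate()

    // ... your analysis logic here ...

    spark.stop()
  }
}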

In any case, that is about as much as I can say from the information provided. Please add more details so you can get a better answer.

Regarding "java - NullPointerException only occurs when running the jar file - Scala Spark", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56604350/
