gpt4 book ai didi

java - 卡夫卡流无法在 Spark 作业中工作

转载 作者:行者123 更新时间:2023-11-30 02:44:30 25 4
gpt4 key购买 nike

我编写了代码来从“topicTest1”Kafka 队列获取数据。我无法打印消费者的数据。发生错误并在下面提到,

下面是我使用数据的代码,

public static void main(String[] args) throws Exception {

// StreamingExamples.setStreamingLogLevels();
SparkConf sparkConf = new SparkConf().setAppName("JavaKafkaWordCount").setMaster("local[*]");
;
// Create the context with 2 seconds batch size
JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, new Duration(100));

int numThreads = Integer.parseInt("3");
Map<String, Integer> topicMap = new HashMap<>();
String[] topics = "topicTest1".split(",");
for (String topic : topics) {
topicMap.put(topic, numThreads);
}

JavaPairReceiverInputDStream<String, String> messages = KafkaUtils.createStream(jssc, "9.98.171.226:9092", "1",
topicMap);

messages.print();
jssc.start();
jssc.awaitTermination();
}

使用以下依赖项

<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.10</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-twitter_2.11</artifactId>
<version>1.6.1</version>
</dependency>

我遇到以下错误

 Exception in thread "dispatcher-event-loop-0" java.lang.NoSuchMethodError: scala/Predef$.$conforms()Lscala/Predef$$less$colon$less; (loaded from file:/C:/Users/Administrator/.m2/repository/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar by sun.misc.Launcher$AppClassLoader@4b69b358) called from class org.apache.spark.streaming.scheduler.ReceiverSchedulingPolicy (loaded from file:/C:/Users/Administrator/.m2/repository/org/apache/spark/spark-streaming_2.11/1.6.2/spark-streaming_2.11-1.6.2.jar by sun.misc.Launcher$AppClassLoader@4b69b358).
at org.apache.spark.streaming.scheduler.ReceiverSchedulingPolicy.scheduleReceivers(ReceiverSchedulingPolicy.scala:138)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$receive$1.applyOrElse(ReceiverTracker.scala:450)
at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116)16/11/14 13:38:00 INFO ForEachDStream: metadataCleanupDelay = -1

at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.lang.Thread.run(Thread.java:785)

另一个错误

Exception in thread "JobGenerator" java.lang.NoSuchMethodError: scala/Predef$.$conforms()Lscala/Predef$$less$colon$less; (loaded from file:/C:/Users/Administrator/.m2/repository/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar by sun.misc.Launcher$AppClassLoader@4b69b358) called from class org.apache.spark.streaming.scheduler.ReceivedBlockTracker (loaded from file:/C:/Users/Administrator/.m2/repository/org/apache/spark/spark-streaming_2.11/1.6.2/spark-streaming_2.11-1.6.2.jar by sun.misc.Launcher$AppClassLoader@4b69b358).
at org.apache.spark.streaming.scheduler.ReceivedBlockTracker.allocateBlocksToBatch(ReceivedBlockTracker.scala:114)
at org.apache.spark.streaming.scheduler.ReceiverTracker.allocateBlocksToBatch(ReceiverTracker.scala:203)
at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:247)
at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:246)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.streaming.scheduler.JobGenerator.generateJobs(JobGenerator.scala:246)
at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:181)
at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:87)
at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:86)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

最佳答案

确保您使用正确的版本。假设您使用以下 Maven 依赖项:

    <dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.10</artifactId>
<version>1.6.1</version>
</dependency>

因此工件等于:spark-streaming-kafka_2.10

现在检查您是否使用正确的 Kafka 版本:

cd /KAFKA_HOME/libs

现在找到 kafka_YOUR-VERSION-sources.jar。

如果你有kafka_2.10-0xxxx-sources.jar,那就没问题了! :)如果您使用不同的版本,只需更改maven依赖项或下载正确的kafka版本。

之后检查您的 Spark 版本。确保您使用正确的版本

groupId:org.apache.spark
工件ID:spark-core_2.xx
版本:xxx

关于java - 卡夫卡流无法在 Spark 作业中工作,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/40595662/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com