scala - Why does "java.lang.ClassNotFoundException: Failed to find data source: kinesis" occur despite the spark-streaming-kinesis-asl dependency?

Reposted by: 行者123 | Updated: 2023-12-05 06:29:24


My setup:

  scala:2.11.8
  spark:2.3.0.cloudera4

I have added the following to my pom.xml:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-kinesis-asl_2.11</artifactId>
  <version>2.3.0</version>
</dependency>

However, when I run my Spark streaming code to consume data from Kinesis, it fails with:

Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: kinesis.

I hit a similar error when consuming data from Kafka and fixed it by specifying the dependency jar in the submit command. That does not seem to work this time:

sudo -u hdfs spark2-submit --packages org.apache.spark:spark-streaming-kinesis-asl_2.11:2.3.0 --class com.package.newkinesis --master yarn  sparktest-1.0-SNAPSHOT.jar

How can I fix this? Thanks for your help.

My code:

val spark = SparkSession
  .builder.master("local[4]")
  .appName("SpeedTester")
  .config("spark.driver.memory", "3g")
  .getOrCreate()

val kinesis = spark.readStream
  .format("kinesis")
  .option("streamName", kinesisStreamName)
  .option("endpointUrl", kinesisEndpointUrl)
  .option("initialPosition", "TRIM_HORIZON")
  .option("awsAccessKey", awsAccessKeyId)
  .option("awsSecretKey", awsSecretKey)
  .load()

kinesis.writeStream.format("console").start().awaitTermination()

My complete pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.netease</groupId>
  <artifactId>sparktest</artifactId>
  <version>1.0-SNAPSHOT</version>
  <inceptionYear>2008</inceptionYear>
  <properties>
    <scala.version>2.11.8</scala.version>
  </properties>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.1</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <includes>
                                <include>org/apache/spark/*</include>
                            </includes>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
        <scope>provided</scope>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.11</artifactId>
        <scope>provided</scope>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.11</artifactId>
        <scope>provided</scope>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
      <version>2.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-clients</artifactId>
      <version>2.1.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kinesis-asl_2.11</artifactId>
      <version>2.3.0</version>
    </dependency>
  </dependencies>
</project>

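Best answer

tl;dr It won't work.

You are using the spark-streaming-kinesis-asl_2.11 dependency, which targets the old Spark Streaming (DStream) API, with the new Spark Structured Streaming API (spark.readStream), hence the exception. That artifact does not provide a Structured Streaming data source named "kinesis", so passing it via --packages does not help.

You have to find a compatible Spark Structured Streaming data source for AWS Kinesis. The Apache Spark project does not officially support one.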
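If switching to the old DStream API is acceptable, the dependency already in the pom can be used directly. The following is a minimal sketch, not the answerer's code: it assumes Spark 2.3's `KinesisInputDStream.builder`, AWS credentials resolved through the default provider chain, and hypothetical placeholder values for the stream name, endpoint URL, region, and checkpoint application name.

```scala
import java.nio.charset.StandardCharsets

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kinesis.{KinesisInitialPositions, KinesisInputDStream}

object KinesisDStreamSketch {
  // Kinesis records arrive as raw bytes; this sketch assumes they hold UTF-8 text.
  def decode(bytes: Array[Byte]): String = new String(bytes, StandardCharsets.UTF_8)

  def main(args: Array[String]): Unit = {
    // Hypothetical placeholders: replace with your own stream settings.
    val kinesisStreamName = "my-stream"
    val kinesisEndpointUrl = "https://kinesis.us-east-1.amazonaws.com"
    val kinesisRegion = "us-east-1"

    // The master is left unset so that spark-submit (--master yarn) supplies it.
    val conf = new SparkConf().setAppName("SpeedTester")
    val ssc = new StreamingContext(conf, Seconds(10))

    // DStream-based receiver provided by spark-streaming-kinesis-asl.
    val records = KinesisInputDStream.builder
      .streamingContext(ssc)
      .streamName(kinesisStreamName)
      .endpointUrl(kinesisEndpointUrl)
      .regionName(kinesisRegion)
      .initialPosition(new KinesisInitialPositions.TrimHorizon())
      .checkpointAppName("SpeedTester") // also names the DynamoDB table used by the KCL
      .checkpointInterval(Seconds(10))
      .storageLevel(StorageLevel.MEMORY_AND_DISK_2)
      .build()

    // Print a few decoded records from each batch to the driver's stdout.
    records.map(bytes => decode(bytes)).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

This keeps the `--packages org.apache.spark:spark-streaming-kinesis-asl_2.11:2.3.0` argument from the question meaningful, because the class it relies on lives in that artifact. It does not give you a DataFrame-based stream; for that, a separate Structured Streaming connector is still required.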

A similar question to this one can be found on Stack Overflow: https://stackoverflow.com/questions/53534395/
