cassandra - Connecting Spark SQL Hive Server to Cassandra?

Reposted by: 行者123 · Updated: 2023-12-04 16:08:58

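I am using Tableau, Spark 1.2, and Cassandra 2.1.2. So far I have managed to do the following:

Connect to the Cassandra instance from the Spark shell via https://github.com/datastax/spark-cassandra-connector

Run SparkSQL queries against the Cassandra instance through that connector.

Run queries and visualizations against the Cassandra instance from Tableau, using the latest CQL3-compatible Simba ODBC driver for Cassandra: http://www.simba.com/connectors/apache-cassandra-odbc

My main gap at this point: how do I correctly configure the Spark 1.2 ThriftServer so that it can talk to my Cassandra instance? The end goal is to run SparkSQL from Tableau, which requires the ThriftServer. I can start the ThriftServer without problems (for the most part), and I can run beeline as in the examples and issue a "show tables" call. But as you can see below, it returns an empty list of tables.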

beeline> !connect jdbc:hive2://192.168.56.115:10000
scan complete in 2ms
Connecting to jdbc:hive2://192.168.56.115:10000
Enter username for jdbc:hive2://192.168.56.115:10000: 
Enter password for jdbc:hive2://192.168.56.115:10000: 
log4j:WARN No appenders could be found for logger (org.apache.thrift.transport.TSaslTransport).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Connected to: Spark SQL (version 1.2.0)
Driver: null (version null)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.56.115:10000> show tables;
+---------+
| result  |
+---------+
+---------+
No rows selected (1.755 seconds)
0: jdbc:hive2://192.168.56.115:10000>

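Do I need the DataStax connector? I have to assume the answer is yes.

Do I need to declare a hive-site.xml even though I am not using Hive at all?

Can I run this setup without Hive/Metastore, or is that a requirement of the ThriftServer in Spark 1.2?

I am assuming my existing Spark Master/Worker setup is correct, but I could be wrong there.

Help! :)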

Best answer

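You can register the Cassandra table as a global temporary view, and you will then be able to access it through the JDBC Thrift server. Global temporary views live in the global_temp database, so clients query the view as global_temp.mytable.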

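import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.cassandra._  // adds cassandraFormat to DataFrameReader
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// Note: SparkSession and global temp views need Spark 2.1 or later, not the Spark 1.2 from the question.
val spark = SparkSession
    .builder()
    .enableHiveSupport()
    .getOrCreate()

// Load the Cassandra table as a DataFrame through the DataStax connector.
val cassandraTable = spark.sqlContext
  .read
  .cassandraFormat("mytable", "mykeyspace", pushdownEnable = true)
  .load()

// Register it in the shared global_temp database so Thrift sessions can see it.
cassandraTable.createGlobalTempView("mytable")

// Start the Thrift JDBC server inside this Spark application, on port 10000.
spark.sqlContext.setConf("hive.server2.thrift.port", "10000")
HiveThriftServer2.startWithContext(spark.sqlContext)
println("Server is running")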
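
To check the result from a client, much like the beeline session in the question, something along these lines could be used. This is only a minimal sketch, not part of the original answer: it assumes the Hive JDBC driver is on the classpath, reuses the host and port from the question, and passes an empty username and password as the beeline session did. The object name ThriftClientSketch and its helpers jdbcUrl and qualifiedView are hypothetical names chosen for illustration.

```scala
import java.sql.DriverManager

// Hypothetical helper object, for illustration only.
object ThriftClientSketch {
  // Builds the same HiveServer2-style JDBC URL that beeline connects to.
  def jdbcUrl(host: String, port: Int): String = s"jdbc:hive2://$host:$port"

  // Global temporary views are resolved through the global_temp database.
  def qualifiedView(view: String): String = s"global_temp.$view"

  def main(args: Array[String]): Unit = {
    // Assumption: host and port taken from the question's beeline session.
    val conn = DriverManager.getConnection(jdbcUrl("192.168.56.115", 10000), "", "")
    try {
      val stmt = conn.createStatement()
      val rs = stmt.executeQuery(s"SELECT * FROM ${qualifiedView("mytable")} LIMIT 10")
      val columnCount = rs.getMetaData.getColumnCount
      // Print each row with its columns separated by " | ".
      while (rs.next()) {
        println((1 to columnCount).map(i => rs.getString(i)).mkString(" | "))
      }
    } finally {
      conn.close()
    }
  }
}
```

From beeline itself, the equivalent check would be to run select * from global_temp.mytable limit 10; after connecting.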

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/28048277/
