apache-spark - How to read an ORC transactional Hive table in Spark?

Reposted by: 行者123  Last updated: 2023-12-02 01:01:27

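I am having a problem reading an ORC transactional table through Spark. I get the Hive table's schema, but I cannot read the actual data.
Here is the complete scenario: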

hive> create table default.Hello(id int,name string) clustered by
(id) into 2 buckets STORED AS ORC TBLPROPERTIES
('transactional'='true');
   
hive> insert into default.hello values(10,'abc');

Now I am trying to access the Hive ORC data from Spark SQL, but it shows only the schema:

>spark.sql("select * from  hello").show()

Output: id, name

Best answer

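Yes, compaction can be used as a workaround, but it does not help when the job runs as micro-batches. So I decided to use a JDBC call instead. Please refer to my answer to this issue, or to my Git page: https://github.com/Gowthamsb12/Spark/blob/master/Spark_ACID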
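
To make the JDBC idea concrete, here is a minimal PySpark sketch of reading the table through HiveServer2 rather than directly from the ORC files. It is not the code from the linked Git page. The helper names `hive_jdbc_options` and `read_hive_table_over_jdbc`, the host `localhost`, and the port `10000` are assumptions for illustration, and the Hive JDBC driver jar is assumed to be on Spark's classpath.

```python
from typing import Dict


def hive_jdbc_options(host: str, port: int, database: str, table: str,
                      user: str = "", password: str = "") -> Dict[str, str]:
    """Build the option map for Spark's generic JDBC reader.

    Hypothetical helper: host, port and credentials are placeholders
    that must match your own HiveServer2 instance.
    """
    return {
        "url": f"jdbc:hive2://{host}:{port}/{database}",
        "driver": "org.apache.hive.jdbc.HiveDriver",
        "dbtable": f"{database}.{table}",
        "user": user,
        "password": password,
    }


def read_hive_table_over_jdbc(spark, options: Dict[str, str]):
    """Load the table as a DataFrame via HiveServer2.

    `spark` is an existing SparkSession. The rows are produced by Hive,
    so Hive itself resolves the transactional base and delta files.
    """
    return spark.read.format("jdbc").options(**options).load()


if __name__ == "__main__":
    # Only builds and prints the options; no cluster is needed for this part.
    opts = hive_jdbc_options("localhost", 10000, "default", "hello")
    print(opts["url"])
```

With a live session this would be called as `read_hive_table_over_jdbc(spark, opts).show()`. Depending on the Spark and Hive versions, the default JDBC dialect may quote column names in a way Hive does not accept, in which case a custom dialect may be needed; treat that as something to verify in your own environment.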


Regarding "apache-spark - How to read an ORC transactional Hive table in Spark?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50254590/
