
hadoop - Using SQL analytic functions in Spark

Reposted · Author: 可可西里 · Updated: 2023-11-01 15:32:33

I have the following SQL:

SELECT LIMIT,
       COL1,
       COL2,
       COL3
FROM
  (SELECT ROW_NUMBER() OVER (ORDER BY COL5 DESC) AS LIMIT,
          FROM_UNIXTIME(COL_DATETIME, 'dd-MM-yyyy HH24:mi:ss') COL1,
          CASE WHEN COL6 IN ('A', 'B') THEN A_NUMBER ELSE B_NUMBER END AS COL2,
          COL3
   FROM DBNAME.TABLENAME
   WHERE COL7 LIKE ('123456%')
     AND COL_DATETIME BETWEEN 20150201000000 AND 20150202235959) X

I can execute it successfully from Hive, but I want to run it from Spark. I created a Spark SQL Hive context as shown below:

scala> val sqlHContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlHContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@71138de5

Then I tried to execute the SQL query above as follows:

sqlHContext.sql("SELECT LIMIT, COL1, COL2, COL3 FROM (SELECT ROW_NUMBER () OVER (ORDER BY COL5 DESC) AS LIMIT, FROM_UNIXTIME(COL_DATETIME,'dd-MM-yyyy HH24:mi:ss') COL1, CASE WHEN COL6 IN ('A', 'B') THEN A_NUMBER ELSE B_NUMBER END AS COL2, COL3 FROM DBNAME.TABLENAME WHERE  COL7 LIKE ('123456%')  AND COL_DATETIME BETWEEN 20150201000000 AND 20150202235959) X").collect().foreach(println)

But it fails with an error:

org.apache.spark.sql.AnalysisException: 
Unsupported language features in query:


scala.NotImplementedError: No parse rules for ASTNode type: 882, text: TOK_WINDOWSPEC :
TOK_WINDOWSPEC 1, 90,98, 339
TOK_PARTITIONINGSPEC 1, 91,97, 339
TOK_ORDERBY 1, 91,97, 339
TOK_TABSORTCOLNAMEDESC 1, 95,97, 339
TOK_TABLE_OR_COL 1, 95,95, 339
CALL_DATETIME 1, 95,95, 339

org.apache.spark.sql.hive.HiveQl$.nodeToExpr(HiveQl.scala:1261)

It looks like analytic (window) functions are not supported. I am using Spark version 1.3.0, Hive version 1.1.0, and Hadoop version 2.7.0.

Is there any other way to achieve this with Spark?

Best Answer

Window functions are supported as of Spark 1.4.0. Some limitations remain, for example ROWS BETWEEN is not yet supported. For an example, see the blog post on Spark window functions.
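On Spark 1.4+ the same query can also be expressed with the DataFrame window-function API instead of raw SQL. A minimal sketch, using the column names from the query above (note this uses the modern function names, e.g. `row_number`, which was called `rowNumber` in 1.4/1.5; `LIMIT` is renamed to `RN` since LIMIT is a reserved word, and the Oracle-style `HH24:mi:ss` pattern is replaced with the Java `HH:mm:ss` pattern that `from_unixtime` expects):

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// Assumes the HiveContext (sqlHContext) created earlier in this post.
// Mirrors ROW_NUMBER() OVER (ORDER BY COL5 DESC) from the original SQL.
val w = Window.orderBy(col("COL5").desc)

val result = sqlHContext.table("DBNAME.TABLENAME")
  .where(col("COL7").like("123456%") &&
         col("COL_DATETIME").between(20150201000000L, 20150202235959L))
  .select(
    row_number().over(w).as("RN"),  // "LIMIT" is reserved, so aliased as RN
    from_unixtime(col("COL_DATETIME"), "dd-MM-yyyy HH:mm:ss").as("COL1"),
    when(col("COL6").isin("A", "B"), col("A_NUMBER"))
      .otherwise(col("B_NUMBER")).as("COL2"),
    col("COL3"))

result.collect().foreach(println)
```

This is only a sketch and requires a Spark 1.4+ environment with Hive support; an unpartitioned `Window.orderBy` also pulls all rows into a single partition, so it is best suited to small result sets like this filtered query.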

Regarding "hadoop - Using SQL analytic functions in Spark", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/30334615/
