In MySQL, I can write a query like this:
select
cast(from_unixtime(t.time, '%Y-%m-%d %H:00') as datetime) as timeHour
, ...
from
some_table t
group by
timeHour, ...
order by
timeHour, ...
Here, timeHour in the GROUP BY clause is the result of a select expression.
But when I tried a similar query in Spark SQL, I got this error:
Error: org.apache.spark.sql.AnalysisException:
cannot resolve '`timeHour`' given input columns: ...
My Spark SQL query looks like this:
select
cast(t.unixTime as timestamp) as timeHour
, ...
from
another_table as t
group by
timeHour, ...
order by
timeHour, ...
Is this construct possible in Spark SQL?
Best answer
Is this construct possible in Spark SQL?
Yes, it is. There are two ways to make this work in Spark SQL so that the new column can be used in the GROUP BY and ORDER BY clauses.
Approach 1, using a subquery:
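-- inner query: compute timeHour and pass through the columns to aggregate
-- outer query: timeHour is now a real column, so it can be grouped and ordered on
SELECT timeHour, sum(...) AS someThing FROM (SELECT
from_unixtime((starttime/1000)) AS timeHour
, ...
, starttime
FROM
some_table) AS t
WHERE
starttime >= 1000*unix_timestamp('2017-09-16 00:00:00')
AND starttime <= 1000*unix_timestamp('2017-09-16 04:00:00')
GROUP BY
timeHour
ORDER BY
timeHour
LIMIT 10;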
Approach 2, using WITH (the more elegant way):
-- define a named subquery that computes timeHour
WITH table_aliase AS (SELECT
from_unixtime((starttime/1000)) AS timeHour
, ...
, starttime
FROM
some_table)
-- query the alias like a table; the aggregation happens here
SELECT timeHour, sum(...) AS someThing FROM table_aliase
WHERE
starttime >= 1000*unix_timestamp('2017-09-16 00:00:00')
AND starttime <= 1000*unix_timestamp('2017-09-16 04:00:00')
GROUP BY
timeHour
ORDER BY
timeHour
LIMIT 10;
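A third option, which is not part of the original answer, is to avoid the alias in GROUP BY and repeat the expression there instead. The sketch below applies this to the query from the question; the `...` placeholders are elided exactly as in the question. It assumes that ORDER BY can still refer to the select alias, which is standard SQL behaviour.

```sql
-- Sketch: group by the expression itself rather than by its alias.
select
  cast(t.unixTime as timestamp) as timeHour
  , ...
from
  another_table as t
group by
  cast(t.unixTime as timestamp), ...
order by
  timeHour, ...
```

The trade-off is that the expression appears twice and both copies must be kept in sync, which is why the subquery and WITH forms can be easier to maintain for longer expressions.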
An alternative that uses the Spark DataFrame API in Scala (without SQL):
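// This code may need additional imports to work. In particular, the $"..." column
// syntax needs `import spark.implicits._`, assuming `spark` is your SparkSession.
val df = .... //load the actual table as df
import org.apache.spark.sql.functions._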
df.withColumn("timeHour", from_unixtime($"starttime"/1000))
.groupBy($"timeHour")
.agg(sum("...").as("someThing"))
.orderBy($"timeHour")
.show()
// Another way, as suggested in eliasah's comment: group by the expression directly
df.groupBy(from_unixtime($"starttime"/1000).as("timeHour"))
.agg(sum("...").as("someThing"))
.orderBy($"timeHour")
.show()
Source: the Stack Overflow question "mysql - Reusing the result of a select expression in the "GROUP BY" clause?": https://stackoverflow.com/questions/46395333/