apache-spark-sql - SparkSQL - Correlated scalar subqueries can only contain equality predicates

Reposted by: 行者123. Last updated: 2023-12-05 04:11:56

I want to run the following query with Spark SQL 2.0:

SELECT
a.id as id,
(SELECT SUM(b.points) 
  FROM tableB b 
  WHERE b.id = a.id AND b.date <= a.date) AS points
FROM tableA a

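But I get the following error:

Correlated scalar subqueries can only contain equality predicates.

Any idea how to rewrite the query, or how to use operations between the two DataFrames tableA and tableB, to make this work?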

Best answer

select a.id as id, 
sum(b.points) as points 
from a, b 
where a.id = b.id 
and b.date <= a.date 
group by a.id 
;

Skip the sub-select and group by id instead. Grouping guarantees a one-to-one relationship between each id and the sum of b's points column.
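One difference from the original query is worth checking before adopting the rewrite: an inner join drops every row of a that has no qualifying row in b, whereas a scalar subquery would return NULL for it. The sketch below illustrates this with a LEFT JOIN variant. It is an assumption-laden illustration, not part of the answer: it uses Python's built-in sqlite3 as a stand-in for Spark SQL, the rows are hypothetical, and it assumes a.id is unique in a (otherwise GROUP BY a.id would merge several a rows).

```python
import sqlite3

# SQLite (Python standard library) stands in for Spark SQL; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, date TEXT);
    CREATE TABLE b (id INTEGER, points INTEGER, date TEXT);
    INSERT INTO a VALUES (1, '2017-01-22'), (2, '2017-01-22');
    INSERT INTO b VALUES (1, 10, '2017-01-21'),
                         (1, 5,  '2017-01-20'),
                         (2, 7,  '2017-01-23');
""")

# Inner join, as in the answer: id 2 has no b row dated on or before a.date,
# so it disappears from the result.
inner_rows = conn.execute("""
    SELECT a.id, SUM(b.points) AS points
    FROM a JOIN b ON a.id = b.id AND b.date <= a.date
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

# Left join: the date condition stays in the ON clause, so id 2 is kept
# and its sum is NULL, which is what the scalar subquery would have produced.
left_rows = conn.execute("""
    SELECT a.id, SUM(b.points) AS points
    FROM a LEFT JOIN b ON a.id = b.id AND b.date <= a.date
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

print(inner_rows)  # [(1, 15)]
print(left_rows)   # [(1, 15), (2, None)]
```

If unmatched ids should appear in the output, the LEFT JOIN form is the closer match to the original subquery. Note that moving the date condition from ON into WHERE would filter the NULL rows out again.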

Here is the quick-and-dirty example I used:

select * from a ;

id|date
1|2017-01-22 17:59:49
2|2017-01-22 18:00:00
3|2017-01-22 18:00:05
4|2017-01-22 18:00:11
5|2017-01-22 18:00:15

select * from b ;
id|points|date
1|12|2017-01-21 18:03:20
3|25|2017-01-21 18:03:37
5|17|2017-01-21 18:03:55
2|-1|2017-01-22 18:04:27
4|-4|2017-01-22 18:04:35
5|400|2017-01-20 18:17:31
5|-1000|2017-01-23 18:18:36

Note that b has three entries with id = 5: two dated before a.date and one dated after it.

select a.id, sum(b.points) as points from a, b where a.id = b.id and b.date <= a.date group by a.id ;
1|12
3|25
5|417
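The sample run above can be reproduced outside Spark as a sanity check. The following is a minimal sketch that loads the same rows into an in-memory SQLite database through Python's sqlite3 module (an assumption made for convenience; it is not Spark), runs the join query, and compares it with the original correlated subquery, which SQLite does accept. Timestamps are stored as text, which compares correctly because they all share the `YYYY-MM-DD HH:MM:SS` layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, date TEXT);
    CREATE TABLE b (id INTEGER, points INTEGER, date TEXT);
""")
conn.executemany("INSERT INTO a VALUES (?, ?)", [
    (1, "2017-01-22 17:59:49"),
    (2, "2017-01-22 18:00:00"),
    (3, "2017-01-22 18:00:05"),
    (4, "2017-01-22 18:00:11"),
    (5, "2017-01-22 18:00:15"),
])
conn.executemany("INSERT INTO b VALUES (?, ?, ?)", [
    (1, 12, "2017-01-21 18:03:20"),
    (3, 25, "2017-01-21 18:03:37"),
    (5, 17, "2017-01-21 18:03:55"),
    (2, -1, "2017-01-22 18:04:27"),
    (4, -4, "2017-01-22 18:04:35"),
    (5, 400, "2017-01-20 18:17:31"),
    (5, -1000, "2017-01-23 18:18:36"),
])

# The join-and-group rewrite from the answer.
join_rows = conn.execute("""
    SELECT a.id, SUM(b.points) AS points
    FROM a, b
    WHERE a.id = b.id AND b.date <= a.date
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

# The original correlated scalar subquery, for comparison.
subquery_rows = conn.execute("""
    SELECT a.id AS id,
           (SELECT SUM(b.points) FROM b
            WHERE b.id = a.id AND b.date <= a.date) AS points
    FROM a
    ORDER BY a.id
""").fetchall()

print(join_rows)      # [(1, 12), (3, 25), (5, 417)]
print(subquery_rows)  # [(1, 12), (2, None), (3, 25), (4, None), (5, 417)]
```

The two queries agree on ids 1, 3 and 5. Ids 2 and 4 have only b rows dated after a.date, so the subquery yields NULL for them while the inner join omits them.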

I also confirmed that GROUP BY is supported: http://spark.apache.org/docs/latest/sql-programming-guide.html#supported-hive-features

For "apache-spark-sql - SparkSQL - Correlated scalar subqueries can only contain equality predicates", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41631199/
