
google-bigquery - How is the value of count(*) determined in BigQuery?

Reposted. Author: 行者123. Updated: 2023-12-04 19:03:04

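I joined a table of about 70,000 rows to a slightly larger second table with an inner join. Now count(a.business_column) and count(*) give different results: the former correctly reports ~70,000, while the latter gives ~200,000. But this only happens when I select count(*) on its own. When I select both together, they give the same result (~70,000). How is this possible?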

select
count(*)
/*,count(a.business_column)*/

from table_a a
inner join each table_b b
on b.key_column = a.business_column

Best answer

Update: for a step-by-step explanation of how this works, see "BigQuery flattens when using field with same name as repeated field" instead.

To answer the question in the title: COUNT(*) in BigQuery is always accurate.

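Note that in SQL, COUNT(*) and COUNT(column) have different semantics: COUNT(*) counts rows, while COUNT(column) counts only the rows where that column is not NULL. The example query can therefore be interpreted in more than one way.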

See: http://www.xaprb.com/blog/2009/04/08/the-dangerous-subtleties-of-left-join-and-count-in-sql/
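To make the COUNT(*) versus COUNT(column) distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here: it does not model BigQuery legacy SQL's flattening of repeated fields, so it illustrates the general SQL semantics rather than the exact behavior in the question. The `users` and `emails` tables and their rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database; tables and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER);
    CREATE TABLE emails (userid INTEGER, subject TEXT);
    INSERT INTO users VALUES (1), (2);
    -- user 1 has two emails, one with a NULL subject; user 2 has none
    INSERT INTO emails VALUES (1, 'hello'), (1, NULL);
""")

query = """
    SELECT u.userid,
           COUNT(*)         AS row_count,      -- counts every joined row
           COUNT(e.subject) AS subject_count   -- skips rows where subject IS NULL
    FROM users u
    LEFT JOIN emails e ON u.userid = e.userid
    GROUP BY u.userid
    ORDER BY u.userid
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 2, 1), (2, 1, 0)]
```

User 2 has no emails, yet COUNT(*) still reports 1 because the LEFT JOIN produces one row padded with NULLs. COUNT(e.subject) reports 0 for the same group.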

The article linked above gives this example query:

select user.userid, count(email.subject)
from user
inner join email on user.userid = email.userid
group by user.userid;

The result of that query is ambiguous. The article's author rewrites it as a more explicit query and adds the following comment:

But what if that’s not what the author of the query meant? There’s no way to really know. There are several possible intended meanings for the query, and there are several different ways to write the query to express those meanings more clearly. But the original query is ambiguous, for a few reasons. And everyone who reads this query afterwards will end up guessing what the original author meant. “I think I can safely change this to…”




Regarding "google-bigquery - How is the value of count(*) determined in BigQuery?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32483766/
