I'm trying to generate a numeric series that has a interval of a numeric 30, representing 30 minutes respectively.
我正在尝试生成一个间隔为数字30的数字序列,分别代表30分钟。
At the moment, my query is looking like this:
目前,我的查询如下所示:
WITH bins AS(
SELECT GENERATE_ARRAY(0,1470,30) AS low,
GENERATE_ARRAY(0,1500,30) AS top
), ride AS(
SELECT DATE_DIFF(ended_at,started_at,minute) as cicle
FROM `data-project-389314.Cyclist.july23_trip_filtered`
WHERE rideable_type != 'docked_bike'
)
SELECT low,
top,
cicle
FROM bins
LEFT JOIN ride
ON cicle >= low --Error line
AND cicle < top
GROUP BY low,
top
ORDER BY low
The error I keep on getting is:
我一直收到的错误是:
No matching signature for operator >= for argument types: INT64
, ARRAY<INT64>
. Supported signature: ANY >= ANY at [18:11]
I tried other codes which led me to recognize that Bigquery doesn't allow time date >= 24:00:00 but I kept getting error on that same variable always.
我尝试了其他代码,这些代码使我认识到BigQuery不允许Time Date>=24:00:00,但我总是在同一个变量上出错。
I appreciate any help.
我很感激你们的帮助。
更多回答
Hi - what is it that you don’t understand about that error message? You’re comparing a timestamp to an array which is a) not allowed and b) would be meaningless even if it was allowed. What are you actually trying to achieve?
您好-您对该错误消息有什么不理解的地方?您正在将时间戳与一个数组进行比较,该数组a)是不允许的,b)即使它是允许的,也是没有意义的。你到底想要达到什么目的?
What I tried to do is create a list where each row specify, beguining by the range, 00:00:00 to 01:00:00, and how many rides within this range has computed in the table. I assume the problem is because 'cicle' isnt aggregated, like lemon stated. I alredy updated my table because the colum I have - ride_lenght- surpass the time range 24:00:00 and it was in a string format!!
我尝试做的是创建一个列表,根据00:00:00到01:00:00的范围指定每一行的位置,并在表中计算该范围内的乘车次数。我想问题是因为‘卷发’并不像柠檬所说的那样是聚集的。我已经更新了我的表,因为我有的列-RIDE_LENGHTT-超过了24:00:00的时间范围,而且它是字符串格式的!!
优秀答案推荐
There are a couple of mistakes in your query:
您的查询中有几个错误:
- Your
GENERATE_ARRAY
function will generate an array, not a table. Hence when you attempt to join on matching values, you're really attempting to match each single value of "cicle" with the full arrays.
- In the final query, you're using a
GROUP BY
clause, yet you're not using aggregate functions. You could use the DISTINCT
to remove duplicate records, if that should feel necessary according to the nature of your data. Or if any aggregate function should be carried out, make sure to either aggregate "cicle" or add it to the GROUP BY
clause, to avoid subtle errors.
In order to fix these errors, you can:
要修复这些错误,您可以:
- change your first cte to unpack your arrays into tabular values, by cross-joining the application of
UNNEST
on the output of the GENERATE_ARRAY
function
- adding the "cicle" column inside the
GROUP BY
clause, to reflect the cicle partition.
Minor optional fixes:
次要的可选修复:
- using
BETWEEN ... AND ...
instead of two conditions inside the WHERE
clause
- changing the name "top" to "high" (looks more suited with respect to "low")
WITH bins AS(
SELECT *
FROM UNNEST(GENERATE_ARRAY(0,1470,30)) AS low,
UNNEST(GENERATE_ARRAY(0,1500,30)) AS high
), ride AS(
SELECT DATE_DIFF(ended_at,started_at,minute) as cicle
FROM `data-project-389314.Cyclist.july23_trip_filtered`
WHERE rideable_type != 'docked_bike'
)
SELECT DISTINCT low,
high,
cicle
FROM bins
LEFT JOIN ride
ON cicle BETWEEN low AND high
ORDER BY low
更多回答
WITH bins AS( SELECT * FROM UNNEST(GENERATE_ARRAY(0,1470,30)) AS low, UNNEST(GENERATE_ARRAY(0,1500,30)) AS high ), ride AS( SELECT count(rides_lenght) as cicle FROM data-project-389314.Cyclist.july23_trip_filtered
WHERE rideable_type != 'docked_bike' ) SELECT low, high, cicle FROM bins LEFT JOIN ride ON cicle BETWEEN low AND high GROUP BY low, high ORDER BY low
当垃圾桶为(SELECT*FROM UNNEST(GENERATE_ARRAY(0,1470,30))时,UNNEST(GENERATE_ARRAY(0,1500,30))为高),骑行方式(SELECT COUNT(RIDES_LENGHTT)AS CLICLE FROM data-project-389314.Cyclist.july23_trip_filtered WHERE RIDEABLE_TYPE!=‘DOKED_BIKE’)选择低、高、左侧垃圾桶中的自行车在低和高之间按低分组、按低顺序骑行
Using DISTINCT
is the correct way of writing GROUP BY low, high
in the solution you modified here. You should use GROUP BY
only in presence of aggregate functions.
在您在此处修改的解决方案中,使用DISTINCT是按低、高编写group的正确方式。仅当存在聚合函数时才应使用GROUP BY。
我是一名优秀的程序员,十分优秀!