
cassandra - CQL (Cassandra) - Select only the rows that have the maximum value in one of the columns

Reposted. Author: 行者123. Updated: 2023-12-04 03:50:39

I need to find the rows that have a given stationid, a time1 greater than a specified time, and the maximum time2.

The table is created like this:

    CREATE TABLE forec (
        stationid int,
        time1 timestamp,
        time2 timestamp,
        value double,
        PRIMARY KEY ((stationid), time1, time2)
    ) WITH CLUSTERING ORDER BY (time1 DESC);

Suppose the table contains the following data:

    +-----------+---------------------+---------------------+-------+
    | stationid | time1               | time2               | value |
    +-----------+---------------------+---------------------+-------+
    | 1         | 2020-10-21 06:00:00 | 2020-10-21 05:00:00 | 1     |
    | 1         | 2020-10-21 06:00:00 | 2020-10-21 04:00:00 | 2     |
    | 1         | 2020-10-21 06:00:00 | 2020-10-21 03:00:00 | 3     |
    | 1         | 2020-10-21 05:00:00 | 2020-10-21 04:00:00 | 4     |
    | 1         | 2020-10-21 05:00:00 | 2020-10-21 03:00:00 | 5     |
    | 1         | 2020-10-21 04:00:00 | 2020-10-21 02:00:00 | 6     |
    +-----------+---------------------+---------------------+-------+

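The query I want is: give me all rows with stationid = 1 and time1 >= 2020-10-21 05:00:00 where time2 has its maximum value for that time1. The query should return the following rows: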

    +-----------+---------------------+---------------------+-------+
    | stationid | time1               | time2               | value |
    +-----------+---------------------+---------------------+-------+
    | 1         | 2020-10-21 06:00:00 | 2020-10-21 05:00:00 | 1     |
    | 1         | 2020-10-21 05:00:00 | 2020-10-21 04:00:00 | 4     |
    +-----------+---------------------+---------------------+-------+

I know I can query like this:

    SELECT * FROM forec WHERE stationid = 1 AND time1 >= '2020-10-21 05:00:00';

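and then filter the results on the client side (keeping only the rows with the maximum time2), but I would like to know whether this can be done more efficiently, by filtering the results on the Cassandra side.

Or perhaps I should change the table model?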
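
For reference, the client-side filtering mentioned in the question could look roughly like the Python sketch below. It works on plain tuples that stand in for the rows returned by the SELECT statement (fetching them through a driver is left out), and the helper names `ts` and `latest_per_time1` are hypothetical, introduced only for this illustration.

```python
from datetime import datetime


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


# Stand-in for the rows returned by the SELECT with time1 >= '2020-10-21 05:00:00'.
# Each tuple is (stationid, time1, time2, value), copied from the sample table.
rows = [
    (1, ts("2020-10-21 06:00:00"), ts("2020-10-21 05:00:00"), 1.0),
    (1, ts("2020-10-21 06:00:00"), ts("2020-10-21 04:00:00"), 2.0),
    (1, ts("2020-10-21 06:00:00"), ts("2020-10-21 03:00:00"), 3.0),
    (1, ts("2020-10-21 05:00:00"), ts("2020-10-21 04:00:00"), 4.0),
    (1, ts("2020-10-21 05:00:00"), ts("2020-10-21 03:00:00"), 5.0),
]


def latest_per_time1(rows):
    """Keep, for each time1, only the row with the largest time2."""
    best = {}
    for row in rows:
        time1, time2 = row[1], row[2]
        current = best.get(time1)
        if current is None or time2 > current[2]:
            best[time1] = row
    # Return the newest time1 first, matching the table's clustering order.
    return sorted(best.values(), key=lambda r: r[1], reverse=True)


filtered = latest_per_time1(rows)
for row in filtered:
    print(row)
```

This keeps one row per time1 in a single pass, but every row in the time range still has to travel from Cassandra to the client, which is the cost the question is trying to avoid.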

Best answer

A solution using a UDA/UDF:

State function:

    -- Keeps the (time, value) pair with the latest time seen so far.
    -- The tuple is created on the first non-null input row.
    CREATE OR REPLACE FUNCTION curValState (
        state tuple<timestamp, double>,
        time timestamp,
        value double
    )
    CALLED ON NULL INPUT
    RETURNS tuple<timestamp, double>
    LANGUAGE java
    AS '
        if (time != null && value != null) {
            if (state == null) {
                com.datastax.driver.core.TupleType tupleType = com.datastax.driver.core.TupleType.of(
                    com.datastax.driver.core.ProtocolVersion.NEWEST_SUPPORTED,
                    com.datastax.driver.core.CodecRegistry.DEFAULT_INSTANCE,
                    com.datastax.driver.core.DataType.timestamp(),
                    com.datastax.driver.core.DataType.cdouble());
                state = tupleType.newValue(time, value);
            } else if (state.getTimestamp(0).compareTo(time) < 0) {
                state.setTimestamp(0, time);
                state.setDouble(1, value);
            }
        }
        return state;
    ';

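Final function:

    -- Extracts the value from the final state.
    -- Returns null when no row contributed to the state.
    CREATE OR REPLACE FUNCTION finalVal (state tuple<timestamp, double>)
    CALLED ON NULL INPUT
    RETURNS double
    LANGUAGE java
    AS 'if (state == null) { return null; } return state.getDouble(1);';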

Aggregate function:

    CREATE OR REPLACE AGGREGATE valueatlatesttime (timestamp, double)
    SFUNC curValState
    STYPE tuple<timestamp, double>
    FINALFUNC finalVal
    INITCOND null;

Query:

    SELECT
        stationid,
        time1,
        max(time2) AS max_time2,
        valueatlatesttime(time2, value) AS value_at_max_time2
    FROM forec
    WHERE stationid = 1
      AND time1 >= '2020-10-21 05:00:00'
    GROUP BY time1;
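
To make the mechanics of the aggregate easier to follow, here is a rough pure-Python model of what the query above computes on the sample data. It does not talk to Cassandra: `cur_val_state` and `final_val` only mirror the CQL functions for readability, `final_val` returning None for an empty state is this model's own choice, and `run_query` with its in-memory `rows` list is a hypothetical stand-in for the server-side filtering and GROUP BY.

```python
from datetime import datetime
from itertools import groupby


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def cur_val_state(state, time, value):
    """Model of curValState: keep the (time, value) pair with the latest time."""
    if time is not None and value is not None:
        if state is None or state[0] < time:
            state = (time, value)
    return state


def final_val(state):
    """Model of finalVal: extract the value (None if no row was seen)."""
    return None if state is None else state[1]


def run_query(rows, stationid, min_time1):
    """Model of SELECT ... WHERE ... GROUP BY time1 over in-memory rows."""
    selected = [r for r in rows if r[0] == stationid and r[1] >= min_time1]
    selected.sort(key=lambda r: r[1], reverse=True)  # clustering order: time1 DESC
    result = []
    for time1, group in groupby(selected, key=lambda r: r[1]):
        state = None  # corresponds to INITCOND null
        max_time2 = None
        for _, _, time2, value in group:
            state = cur_val_state(state, time2, value)  # SFUNC, once per row
            if max_time2 is None or time2 > max_time2:
                max_time2 = time2
        result.append((stationid, time1, max_time2, final_val(state)))  # FINALFUNC
    return result


# (stationid, time1, time2, value), copied from the sample table.
rows = [
    (1, ts("2020-10-21 06:00:00"), ts("2020-10-21 05:00:00"), 1.0),
    (1, ts("2020-10-21 06:00:00"), ts("2020-10-21 04:00:00"), 2.0),
    (1, ts("2020-10-21 06:00:00"), ts("2020-10-21 03:00:00"), 3.0),
    (1, ts("2020-10-21 05:00:00"), ts("2020-10-21 04:00:00"), 4.0),
    (1, ts("2020-10-21 05:00:00"), ts("2020-10-21 03:00:00"), 5.0),
    (1, ts("2020-10-21 04:00:00"), ts("2020-10-21 02:00:00"), 6.0),
]

result = run_query(rows, 1, ts("2020-10-21 05:00:00"))
for row in result:
    print(row)
```

In this model the state function runs once per row within each time1 group and the final function runs once per group, so each group produces a single output row carrying the value that belongs to its latest time2.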

Regarding "cassandra - CQL (Cassandra) - Select only the rows that have the maximum value in one of the columns", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64460377/
