gpt4 book ai didi

PostgreSQL 查询 : getting latest forecast before a deadline, 与实际比较

转载 作者:行者123 更新时间:2023-11-29 13:51:24 25 4
gpt4 key购买 nike

我想查找按小时和天分割的历史实际风和预报风。

我对一天中的某个时间有多个预报。我的第二天交易的交易截止日期为东部标准时间上午 10 点,因此我希望在此之前的最新预测与该小时的实际风在同一条线上。

使事情复杂化的是时间戳是格林威治标准时间,比美国东部标准时间早 5 小时。

   WITH
forecast_prep AS (
SELECT
date_trunc('day', (foretime - interval '5 hours')) :: DATE AS Foredate,
extract(HOUR FROM (foretime - interval '5 hours')) + 1 AS foreHE,
lat,
lon,
max(windspeed) as forecast,
max(as_of) - interval '5 hours' AS as_of
FROM weather.forecast
WHERE date_trunc('day', foretime) :: DATE - as_of >= INTERVAL '9 hours'
GROUP BY Foredate, foreHE, lat, lon
),
tmp AS (
SELECT
meso.station,
meso.lat,
meso.lon,
(meso.timestmp - interval '5 hours') as timestmp,
date_trunc('day', (meso.timestmp - interval '5 hours')) :: DATE AS Date,
extract(HOUR FROM (meso.timestmp - interval '5 hours')) + 1 AS HE,
CAST(AVG(meso.windspd) AS NUMERIC(19, 2)) AS Actual
FROM weather.meso
GROUP BY station, lat, lon, timestmp, Date, HE
)
SELECT
tmp.station, tmp.Date, tmp.HE, tmp.Actual, forecast_prep.forecast, forecast_prep.as_of
FROM tmp
INNER JOIN forecast_prep ON (
tmp.lat = forecast_prep.lat
AND tmp.lon = forecast_prep.lon
AND tmp.Date = forecast_prep.Foredate
AND tmp.HE = forecast_prep.foreHE
)
WHERE
(tmp.timestmp BETWEEN '2016-02-01' AND '2016-02-02')
AND (tmp.station = 'KSBN')
GROUP BY
tmp.station, tmp.Date, tmp.HE, forecast_prep.forecast, forecast_prep.as_of, tmp.Actual
ORDER BY tmp.Date, tmp.HE ASC;

下面是带有相关示例数据的完整表结构。

CREATE SCHEMA weather
CREATE TABLE weather.forecast
(
foretime timestamp without time zone NOT NULL,
as_of timestamp without time zone NOT NULL, -- in UTC
summary text,
precipintensity numeric(8,4),
precipprob numeric(2,2),
temperature numeric(5,2),
apptemp numeric(5,2),
dewpoint numeric(5,2),
humidity numeric(2,2),
windspeed numeric(5,2),
windbearing numeric(4,1),
visibility numeric(5,2),
cloudcover numeric(4,2),
pressure numeric(6,2),
ozone numeric(5,2),
preciptype text,
lat numeric(8,6) NOT NULL,
lon numeric(9,6) NOT NULL,
CONSTRAINT forecast_pkey PRIMARY KEY (foretime, as_of, lat, lon)
);

INSERT INTO weather.forecast
(windspeed, foretime, as_of, lat, lon)
VALUES
(11.19, '2/1/2016 8:00', '1/30/2016 23:00', 34.556, 28.345),
(10.98, '2/1/2016 8:00', '1/31/2016 5:00', 34.556, 28.345),
(10.64, '2/1/2016 8:00', '1/31/2016 11:00', 34.556, 28.345),
(10.95, '2/1/2016 8:00', '1/31/2016 17:00', 34.556, 28.345),
(10.39, '2/1/2016 8:00', '1/31/2016 23:00', 34.556, 28.345),
(9.22, '2/1/2016 8:00', '2/1/2016 5:00', 34.556, 28.345),
(10, '2/1/2016 9:00', '1/30/2016 11:00', 34.556, 28.345),
(9.88, '2/1/2016 9:00', '1/30/2016 17:00', 34.556, 28.345),
(10.79, '2/1/2016 9:00', '1/30/2016 23:00', 34.556, 28.345),
(10.8, '2/1/2016 9:00', '1/31/2016 5:00', 34.556, 28.345),
(10.35, '2/1/2016 9:00', '1/31/2016 11:00', 34.556, 28.345),
(10.07, '2/1/2016 9:00', '1/31/2016 17:00', 34.556, 28.345),
(9.57, '2/1/2016 9:00', '1/31/2016 23:00', 34.556, 28.345),
(7.93, '2/1/2016 9:00', '2/1/2016 5:00', 34.556, 28.345)
;

CREATE TABLE weather.meso
(
timestmp timestamp without time zone NOT NULL,
station text NOT NULL,
lat numeric NOT NULL,
lon numeric NOT NULL,
tmp numeric,
hum numeric,
windspd numeric,
winddir integer,
dew numeric,
CONSTRAINT meso_pkey PRIMARY KEY (timestmp, station, lat, lon)
);
INSERT INTO weather.meso
(station, timestmp, lat, lon, windspd)
VALUES
('KSBN', '2/1/2016 8:02', 34.556, 28.345, 16.1),
('KSBN', '2/1/2016 8:12', 34.556, 28.345, 12.6),
('KSBN', '2/1/2016 8:54', 34.556, 28.345, 11.5),
('KSBN', '2/1/2016 9:02', 34.556, 28.345, 18.1),
('KSBN', '2/1/2016 9:17', 34.556, 28.345, 12.2),
('KSBN', '2/1/2016 9:48', 34.556, 28.345, 11.5)
;

这是我想要的输出格式:

station   date       he  actual forecast   as_of  
KSBN 2/1/2016 4 10.4 15.1 1/31/2016 6:00
KSBN 2/1/2016 5 12.7 11.32 1/31/2016 6:00

最佳答案

DDL 和示例数据确实有助于理解,但我所能提出的只是关于如何利用 row_number 的更多详细信息,例如,也可在此处在线获得 http://rextester.com/FIEUPI83002

select
row_number() OVER(PARTITION BY date_trunc('day', (foretime - interval '5 hours')) :: DATE
ORDER BY case when extract(HOUR FROM (foretime - interval '5 hours')) < 10 then 1 else 2 end, AS_OF desc) AS rn
, extract(HOUR FROM (foretime - interval '5 hours')) HR
, foretime
, as_of
from forecast
order by RN, as_of DESC

根据可用的示例数据,其结果如下:

+----+----+-----------+---------------------+---------------------+
| | rn | date_part | foretime | as_of |
+----+----+-----------+---------------------+---------------------+
| 1 | 1 | 4 | 01.02.2016 09:00:00 | 01.02.2016 05:00:00 |
| 2 | 2 | 3 | 01.02.2016 08:00:00 | 01.02.2016 05:00:00 |
| 3 | 3 | 4 | 01.02.2016 09:00:00 | 31.01.2016 23:00:00 |
| 4 | 4 | 3 | 01.02.2016 08:00:00 | 31.01.2016 23:00:00 |
| 5 | 5 | 4 | 01.02.2016 09:00:00 | 31.01.2016 17:00:00 |
| 6 | 6 | 3 | 01.02.2016 08:00:00 | 31.01.2016 17:00:00 |
| 7 | 7 | 4 | 01.02.2016 09:00:00 | 31.01.2016 11:00:00 |
| 8 | 8 | 3 | 01.02.2016 08:00:00 | 31.01.2016 11:00:00 |
| 9 | 9 | 3 | 01.02.2016 08:00:00 | 31.01.2016 05:00:00 |
| 10 | 10 | 4 | 01.02.2016 09:00:00 | 31.01.2016 05:00:00 |
| 11 | 11 | 3 | 01.02.2016 08:00:00 | 30.01.2016 23:00:00 |
| 12 | 12 | 4 | 01.02.2016 09:00:00 | 30.01.2016 23:00:00 |
| 13 | 13 | 4 | 01.02.2016 09:00:00 | 30.01.2016 17:00:00 |
| 14 | 14 | 4 | 01.02.2016 09:00:00 | 30.01.2016 11:00:00 |
+----+----+-----------+---------------------+---------------------+

因此,如果您要使用过滤器 WHERE RN = 1,则应列出每天的“最近”行,即 10 之前的行。我相信这样的事情会适合您的要求。请注意,使用 case 表达式和对 row_number 序列进行排序的其他列(在 OVER() 子句内)调整列组合以满足您的需要。


下方为原创评论

在没有示例数据的情况下,我将只讨论一种方法;我建议使用 ROW_NUMBER() OVER(按 date_time_column DESC 排序) 例如

select
*
from (
select *
, ROW_NUMBER() OVER(ORDER BY timestmp DESC) AS RN
from forecast_table
-- where timestmp < 10 am (include required logic ere)
)
WHERE RN = 1

由于 DESCendng 顺序,计算列 RN 中值为 1 的行将是最新的行。这也可以与 PARTITION BY 结合使用,因此 row_numebr 方法对于查找“最新”行或“最旧”行甚至每个分区或整体的最大/最小行很有用。

关于PostgreSQL 查询 : getting latest forecast before a deadline, 与实际比较,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/40687950/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com