sql - Postgres: How to find nearest tsrange from timestamp outside of ranges?

Reposted by: 行者123. Updated: 2023-11-29 12:10:01

I am modeling local services offered by suppliers (in Postgres 9.6.1 / PostGIS 2.3.1):

create table supplier (
    id                serial primary key,
    name              text not null check (char_length(name) < 280),
    type              service_type,
    duration          interval,
    ...
    geo_position      geography(POINT,4326)
    ...
);

Each supplier has a calendar with time slots that can be booked:

create table timeslot (
    id                 serial primary key,
    supplier_id        integer not null references supplier(id),
    slot               tstzrange not null,

    constraint supplier_overlapping_timeslot_not_allowed
    exclude using gist (supplier_id with =, slot with &&)
);

For when a customer wants to know which nearby suppliers can be booked at a given time, I created a view and a function:

create view supplier_slots as
    select
        supplier.name, supplier.type, supplier.geo_position, supplier.duration, ...
        timeslot.slot
    from
        supplier, timeslot
    where
        supplier.id = timeslot.supplier_id;


create function find_suppliers(wantedType service_type, near_latitude text, near_longitude text, at_time timestamptz)
returns setof supplier_slots as $$
declare
    nearpoint geography;
begin
    -- WKT expects POINT(longitude latitude), so longitude goes first
    nearpoint := ST_GeographyFromText('SRID=4326;POINT(' || near_longitude || ' ' || near_latitude || ')');
    return query
        select * from supplier_slots
        where type = wantedType
            and tstzrange(at_time, at_time + duration) <@ slot
        order by ST_Distance( nearpoint, geo_position )
        limit 100;
end;
$$ language plpgsql;

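All of this works really well.

Now, for suppliers that have no bookable time slot at the requested time, I would like to find their closest available time slots before and after the requested at_time, also sorted by distance.

This has my head spinning a bit, and I can't find any suitable operator that gives me the nearest tsrange.

Any ideas on the smartest way to do this?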

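Best answer

The solution depends on the exact definition of what you want.

Schema

I suggest these slightly adapted table definitions to make the task simpler, enforce integrity and improve performance: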

-- The exclusion constraint below combines = on an integer with && on a range,
-- which requires the btree_gist extension.
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE supplier (
   supplier_id  serial PRIMARY KEY,
   supplier     text NOT NULL CHECK (length(supplier) < 280),
   type         service_type,
   duration     interval,
   geo_position geography(POINT,4326)
);

CREATE TABLE timeslot (
   timeslot_id  serial PRIMARY KEY,
   supplier_id  integer NOT NULL REFERENCES supplier,  -- FK to supplier(supplier_id)
   slot_a       timestamptz NOT NULL,
   slot_z       timestamptz NOT NULL,
   CONSTRAINT   timeslot_range_valid CHECK (slot_a < slot_z),
   CONSTRAINT   timeslot_no_overlapping
     EXCLUDE USING gist (supplier_id WITH =, tstzrange(slot_a, slot_z) WITH &&)
);

CREATE INDEX timeslot_slot_z ON timeslot (supplier_id, slot_z);
CREATE INDEX supplier_geo_position_gist ON supplier USING gist (geo_position);

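Save two timestamptz columns slot_a and slot_z instead of the tstzrange column slot, and adapt the constraints accordingly. This way, all ranges are automatically treated with the default bounds (inclusive lower, exclusive upper), which avoids corner-case errors and headaches.
Collateral benefit: two timestamptz occupy only 16 bytes, instead of 25 bytes (32 with padding) for the tstzrange.
All queries you might have had on slot keep working with tstzrange(slot_a, slot_z) as a drop-in replacement.
Add an index on (supplier_id, slot_z) for the query at hand.
And a spatial index on supplier.geo_position (which you probably have already).
Depending on the data distribution in type, a couple of partial indexes for types that are common in queries might help performance: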
```
CREATE INDEX supplier_geo_type_foo_gist ON supplier USING gist (geo_position)
WHERE type = 'foo'::service_type;
```
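
As a quick check of the default bounds mentioned above, here is a minimal sketch. The timestamps are made up for illustration; lower_inc() and upper_inc() are the built-in range functions that report whether each bound is inclusive.

```sql
-- tstzrange(a, z) without a third argument uses the default bounds '[)'
SELECT lower_inc(r)                           AS lower_inclusive  -- true
     , upper_inc(r)                           AS upper_inclusive  -- false
     , r @> timestamptz '2017-01-01 10:00+00' AS contains_start   -- true
     , r @> timestamptz '2017-01-01 11:00+00' AS contains_end     -- false
FROM  (SELECT tstzrange('2017-01-01 10:00+00', '2017-01-01 11:00+00') AS r) sub;
```

Because the upper bound is excluded, a slot ending at 11:00 and another starting at 11:00 do not overlap, so the exclusion constraint accepts back-to-back slots.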

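Query / Function

This query finds the X closest suppliers (100 in the example) that offer the right service_type, each with the single closest matching time slot (defined by the time distance to the start of the slot). I combined this with actually matching slots, which may or may not be what you need.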

CREATE FUNCTION f_suppliers_nearby(_type service_type, _lat text, _lon text, at_time timestamptz)
  RETURNS TABLE (supplier_id  int
               , name         text
               , duration     interval
               , geo_position geography(POINT,4326)
               , distance     float
               , timeslot_id  int
               , slot_a       timestamptz
               , slot_z       timestamptz
               , time_dist    interval
   ) AS
$func$
   WITH sup_nearby AS (  -- find matching or later slot
      SELECT s.supplier_id, s.supplier, s.duration, s.geo_position
             -- WKT expects POINT(longitude latitude), so _lon goes first
           , ST_Distance(ST_GeographyFromText('SRID=4326;POINT(' || _lon || ' ' || _lat || ')')
                          , s.geo_position) AS distance
           , t.timeslot_id, t.slot_a, t.slot_z
           , CASE WHEN t.slot_a IS NOT NULL
                  THEN GREATEST(t.slot_a - at_time, interval '0') END AS time_dist
      FROM   supplier s
      LEFT   JOIN LATERAL (
         SELECT *
         FROM   timeslot
         WHERE  supplier_id = s.supplier_id  -- correlate with the outer supplier row
         AND    slot_z > at_time + s.duration  -- excl. upper bound
         ORDER  BY slot_z
         LIMIT  1
         ) t ON true
      WHERE  s.type = _type
      ORDER  BY distance  -- output column computed above
      LIMIT  100
      )
   SELECT *
   FROM  (
      SELECT DISTINCT ON (supplier_id) *  -- 1 slot per supplier
      FROM  (
         TABLE sup_nearby  -- matching or later slot

         UNION ALL         -- earlier slot
         SELECT s.supplier_id, s.supplier, s.duration, s.geo_position
              , s.distance
              , t.timeslot_id, t.slot_a, t.slot_z
              , GREATEST(at_time - t.slot_a, interval '0') AS time_dist
         FROM   sup_nearby s
         CROSS  JOIN LATERAL (  -- this time CROSS JOIN!
            SELECT *
            FROM   timeslot
            WHERE  supplier_id = s.supplier_id
            AND    slot_z <= at_time  -- excl. upper bound
            ORDER  BY slot_z DESC
            LIMIT  1
            ) t
         WHERE  s.time_dist IS DISTINCT FROM interval '0'  -- exact matches are done
         ) sub
      ORDER  BY supplier_id, time_dist  -- pick temporally closest slot per supplier
   ) sub
   ORDER  BY time_dist, distance;  -- matches first, ordered by distance; then misses, ordered by time distance

$func$  LANGUAGE sql;
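
A call might look like the following sketch. The type value 'foo', the coordinates and the timestamp are hypothetical and only illustrate the argument order (type, latitude, longitude, time).

```sql
-- Assumes 'foo' is a valid value of service_type; all input values are made up
SELECT supplier_id, name, distance, slot_a, slot_z, time_dist
FROM   f_suppliers_nearby('foo', '40.73', '-73.99', '2017-01-01 12:00+00');
```

Rows with time_dist = 0 are suppliers bookable at the requested time; rows with a NULL time_dist are suppliers without any qualifying slot, which sort last.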

I did not use your view supplier_slots and optimized for performance instead. The view may still be convenient. You might include tstzrange(slot_a, slot_z) AS slot for backward compatibility.
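
A backward-compatible view on top of the adapted tables could be sketched like this. It assumes the column names from the schema above and only lists the columns named in the question; the elided ones are left out.

```sql
-- Sketch of a compatibility view; exposes the two timestamps as a range again
CREATE VIEW supplier_slots AS
SELECT s.supplier AS name, s.type, s.geo_position, s.duration
     , tstzrange(t.slot_a, t.slot_z) AS slot  -- default bounds '[)'
FROM   supplier s
JOIN   timeslot t USING (supplier_id);
```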

The basic query to find the 100 closest suppliers is a textbook "K nearest neighbour" problem. A GiST index works well for this. Related:

How do I query all rows within a 5-mile radius of my coordinates?
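
For illustration, a nearest-neighbour query in its simplest form might look like the sketch below. It uses the PostGIS distance operator <->, which can be served by the GiST index on geo_position; the type value 'foo' and the coordinates are hypothetical. This is not what the function above does (it orders by ST_Distance), so treat it as an assumption to test against your PostGIS version.

```sql
-- KNN sketch: ORDER BY ... <-> ... LIMIT n lets the planner walk the GiST index
SELECT s.supplier_id, s.supplier
FROM   supplier s
WHERE  s.type = 'foo'::service_type
ORDER  BY s.geo_position <-> ST_GeographyFromText('SRID=4326;POINT(-73.99 40.73)')
LIMIT  100;
```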

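The additional task (finding the temporally nearest slot) can be split into two tasks: find the next higher row and the next lower row. The core feature of the solution is to have two subqueries with ORDER BY slot_z LIMIT 1 and ORDER BY slot_z DESC LIMIT 1, which result in two very fast index scans.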
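
Stripped of everything else, the two lookups can be sketched for a single supplier as follows. The supplier_id 1 and the timestamp are made-up inputs; both branches can use the index on (supplier_id, slot_z).

```sql
-- Next slot ending after the given time
(SELECT 'next' AS which, timeslot_id, slot_a, slot_z
 FROM   timeslot
 WHERE  supplier_id = 1
 AND    slot_z > timestamptz '2017-01-01 12:00+00'
 ORDER  BY slot_z
 LIMIT  1)
UNION ALL
-- Previous slot ending at or before the given time
(SELECT 'previous', timeslot_id, slot_a, slot_z
 FROM   timeslot
 WHERE  supplier_id = 1
 AND    slot_z <= timestamptz '2017-01-01 12:00+00'
 ORDER  BY slot_z DESC
 LIMIT  1);
```

The parentheses around each branch are needed so that ORDER BY and LIMIT apply per branch rather than to the whole UNION.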

I combined the first one with finding actual matches, which is a (smart, I think) optimization, but may distract from the actual solution.

Regarding sql - Postgres: How to find nearest tsrange from timestamp outside of ranges?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41208541/
