gpt4 book ai didi

sql - oracle 跟踪带有时间戳列的表上的历史记录

转载 作者:行者123 更新时间:2023-12-03 20:50:43 28 4
gpt4 key购买 nike

我在 oracle 12c 数据库中有 2 个表,其结构如下。表 A 包含来自具有修改日期时间戳的应用程序的传入数据,
每天我们可能会在表 A 中获得大约 50,000 行。目标是使用表 A 的数据并插入最终目标表 B(通常有数十亿行)
以表A的数据作为行车数据集。
仅当传入的数据集属性发生更改时,才需要在表 B 中插入/合并记录。
基本上目的是仅在其属性(例如状态和 zip_cd)发生更改时,才使用有效时间戳跟踪给定产品的历史/旅程。
见下表结构

Table A  ( PRODUCT_ID, STATE, ZIP_CD, Modified_dt)

'abc', 'MN', '123', '3/5/2020 12:01:00 AM'
'abc', 'MN', '123', '3/5/2020 6:01:13 PM'
'abc', 'IL', '223', '3/5/2020 7:01:15 PM'
'abc', 'OH', '333', '3/5/2020 6:01:16 PM'
'abc', 'NY', '722', '3/5/2020 4:29:00 PM'
'abc', 'KS', '444', '3/5/2020 4:31:41 PM'
'bbc', 'MN', '123', '3/19/2020 2:47:08 PM'
'bbc', 'IL', '223', '3/19/2020 2:50:37 PM'
'ccb', 'MN', '123', '3/21/2020 2:56:24 PM'
'dbd', 'KS', '444', '6/20/2020 12:00:00 AM'

Target Table B (SEQUENCE_KEY,PRODUCT_ID,STATE, ZIP_CD, Valid_From, Valid_To, LATEST_FLAG)
'1', 'abc', 'AR', '999', '3/3/2020 12:00:00 AM', '3/3/2020 6:01:13 PM', 'N'
'2', 'abc', 'AR', '555', '3/3/2020 6:01:14 PM', '3/3/2020 6:01:14 PM', 'N'
'3', 'abc', 'CA', '565', '3/3/2020 6:01:15 PM', '3/4/2020 4:28:59 PM', 'N'
'4', 'abc', 'CA', '777', '3/4/2020 4:29:00 PM', '12/31/2099', 'Y'
'5', 'bbc', 'MN', '123', '3/4/2020 4:31:41 PM', '3/19/2020 2:47:07 PM', 'N'
'6', 'bbc', 'MN', '666', '3/18/2020 2:47:08 PM', '3/19/2020 2:50:36 PM', 'N'
'7', 'bbc', 'MN', '777', '3/18/2020 2:50:37 PM', '12/31/2099', , 'Y'
'8', 'ccb', 'MN', '123', '3/20/2020 2:56:24 PM', '12/31/2099', 'Y'

将数据填充到表 B 中的规则:
  • 输出表上的主键是 product_id 和 valid_from 字段。
    来自表 A 的传入数据的修改 dt 时间戳将始终大于现有表。
  • 为了插入数据,我们必须比较目标表 B 中的 latest_flag = 'Y' 记录和表 A 中的传入数据,并且仅在发生更改时
    在属性 state 和 zip_cd 中,那么需要从表 A 的表 B 中插入一条记录。 valid_to 列是一个计算字段,它总是比
    下一行从日期开始有效,最新行默认为“12/31/2099”。类似地,latest_flag 列是一个计算列,表示给定 product_id 的当前行
  • 在传入的数据集中,如果与前一行或表 B(latest_flag='Y') 中的现有数据相比,有多行没有任何变化,则
    这些也应该被忽略。例如,表 A 中的第 2 行和第 9 行被忽略,因为与该产品的先前行相比,属性状态 zip_cd 没有变化。

  • 根据上述规则,我需要将表 A 的数据合并到表 B 中,最终输出如下所示
    Table B  (SEQUENCE_KEY,PRODUCT_ID,STATE, ZIP_CD, Valid_From, Valid_To,  LATEST_FLAG)
    '1', 'abc', 'AR', '999', '3/3/2020 12:00:00 AM', '3/3/2020 6:01:13 PM', 'N'
    '2', 'abc', 'AR', '555', '3/3/2020 6:01:14 PM' '3/3/2020 6:01:14 PM', 'N'
    '3', 'abc', 'CA', '565', '3/3/2020 6:01:15 PM' '3/4/2020 4:28:59 PM', 'N'
    '4', 'abc', 'CA', '777', '3/4/2020 4:29:00 PM' '3/5/2020 12:00:00 AM', 'N'
    '5', 'abc', 'MN', '123', '3/5/2020 12:01:00 AM', '3/5/2020 7:01:14 PM', 'N'
    '6', 'abc', 'IL', '223' '3/5/2020 7:01:15 PM', '3/5/2020 6:01:15 PM', 'N'
    '7', 'abc', 'OH', '333', '3/5/2020 6:01:16 PM', '3/5/2020 4:28:59 PM', 'N'
    '8', 'abc', 'NY', '722', '3/5/2020 4:29:00 PM', '3/5/2020 4:31:40 PM', 'N'
    '9', 'abc', 'KS', '444', '3/5/2020 4:31:41 PM', '12/31/2099', 'Y'
    '10', 'bbc', 'MN', '123', '3/4/2020 4:31:41 PM' '3/19/2020 2:47:07 PM', 'N'
    '11', 'bbc', 'MN', '666', '3/18/2020 2:47:08 PM' '3/19/2020 2:50:36 PM', 'N'
    '12', 'bbc', 'MN', '777', '3/18/2020 2:50:37 PM' '3/19/2020 2:47:07 PM', 'N'
    '13', 'bbc', 'MN', '123', '3/19/2020 2:47:08 PM' '3/19/2020 2:50:36 PM', 'N'
    '14', 'bbc', 'IL', '223', '3/19/2020 2:50:37 PM' '12/31/2099', 'Y'
    '15', 'ccb', 'MN', '123', '3/20/2020 2:56:24 PM' '12/31/2099', 'Y'
    '16', 'dbd', 'KS', '444', '6/20/2020 12:00:00 AM' '12/31/2099', 'Y'
    寻找解决此问题的建议。
    实时 SQL 链接:
    https://livesql.oracle.com/apex/livesql/s/kfbx7dwzr3zz28v6eigv0ars0
    谢谢你。

    最佳答案

    我试图了解如何在 SQL 中执行此操作,但由于逻辑以及您在所需输出中具有的 sequence_key 重置,这对我来说是不可能的。
    所以,这里是我在 PL/SQL 中的建议

    SQL> select * from table_a ;

    PRODUCT_ID STATE ZIP_CD MODIFIED_
    ------------------------------ ------------------------------ ------------------------------ ---------
    abc MN 123 05-MAR-20
    abc MN 123 05-MAR-20
    abc IL 223 05-MAR-20
    abc OH 333 05-MAR-20
    abc NY 722 05-MAR-20
    abc KS 444 05-MAR-20
    bbc MN 123 19-MAR-20
    bbc IL 223 19-MAR-20
    ccb MN 123 19-MAR-20
    dbd KS 444 19-MAR-20

    10 rows selected.

    SQL> select * from table_b ;

    SEQUENCE_KEY PRODUCT_ID STATE ZIP_CD VALID_FRO VALID_TO L
    ------------ ------------------------------ ------------------------------ ------------------------------ --------- --------- -
    1 abc AR 999 05-MAR-20 05-MAR-20 N
    2 abc AR 555 05-MAR-20 05-MAR-20 N
    3 abc CA 565 05-MAR-20 05-MAR-20 N
    4 abc CA 777 05-MAR-20 31-DEC-99 Y
    5 bbc MN 123 05-MAR-20 05-MAR-20 N
    6 bbc MN 666 05-MAR-20 05-MAR-20 N
    7 bbc MN 777 19-MAR-20 31-DEC-99 Y
    8 ccb MN 123 19-MAR-20 31-DEC-99 Y

    8 rows selected.
    现在,我使用了这段 PL_SQL 代码
    declare 
    type typ_rec_set IS RECORD
    (
    PRODUCT_ID VARCHAR2(30 CHAR),
    STATE VARCHAR2(30 CHAR),
    ZIP_CD VARCHAR2(30 CHAR),
    VALID_FROM DATE ,
    VALID_TO DATE ,
    LATEST_FLAG VARCHAR2(1 CHAR)
    );
    type typ_rec_tab is TABLE OF typ_rec_set;
    l_hdr_tab typ_rec_tab;
    begin
    SELECT product_id
    ,state
    ,zip_cd
    ,valid_from
    ,valid_to
    ,CASE WHEN valid_to = DATE '2099-12-31' THEN 'Y' ELSE 'N' END latest_flag
    BULK COLLECT INTO l_hdr_tab
    FROM
    (
    SELECT a.product_id
    ,a.state
    ,a.zip_cd
    ,a.modified_dt valid_from
    ,NVL(((LEAD (a.modified_dt,1) OVER (PARTITION BY a.product_id ORDER BY a.modified_dt)) - INTERVAL '1' SECOND),DATE '2099-12-31' )valid_to
    ,CASE
    WHEN ( ( b.product_id IS NOT NULL
    AND a.state != b.state
    AND a.zip_cd != b.zip_cd)
    OR b.product_id IS NULL
    ) THEN
    1
    ELSE
    0
    END insert_flag
    FROM table_a a
    LEFT OUTER JOIN table_b b
    ON a.product_id = b.product_id
    AND b.latest_flag = 'Y'
    WHERE (a.modified_dt >= b.valid_from OR b.product_id IS NULL)
    ORDER BY a.product_id,a.modified_dt
    )
    WHERE insert_flag != 0 ;
    --loop
    FOR i IN l_hdr_tab.first .. l_hdr_tab.last
    LOOP
    -- begin block
    begin
    insert into table_b
    (
    sequence_key ,
    PRODUCT_ID ,
    STATE ,
    ZIP_CD ,
    VALID_FROM ,
    VALID_TO ,
    LATEST_FLAG
    )
    values
    (
    ( select max(sequence_key)+1 from table_b ),
    l_hdr_tab(i).product_id ,
    l_hdr_tab(i).state ,
    l_hdr_tab(i).zip_cd ,
    l_hdr_tab(i).valid_from ,
    l_hdr_tab(i).valid_to ,
    l_hdr_tab(i).latest_flag
    );
    end;
    end loop;-- reset sequence base of row_number over product_id valid_from
    commit;
    -- reset sequence
    merge into table_b t
    using ( select sequence_key ,
    PRODUCT_ID ,
    STATE ,
    ZIP_CD ,
    VALID_FROM ,
    VALID_TO ,
    LATEST_FLAG ,
    row_number() over ( order by product_id,valid_from ) as new_seq
    from table_b ) s
    on ( s.rowid = t.rowid )
    when matched then
    update set t.sequence_key = s.new_seq where t.sequence_key != s.new_seq ;
    commit;
    exception when others then raise;
    end;
    /

    然后我运行它
    SQL> host cat proc.sql
    declare
    type typ_rec_set IS RECORD
    (
    PRODUCT_ID VARCHAR2(30 CHAR),
    STATE VARCHAR2(30 CHAR),
    ZIP_CD VARCHAR2(30 CHAR),
    VALID_FROM DATE ,
    VALID_TO DATE ,
    LATEST_FLAG VARCHAR2(1 CHAR)
    );
    type typ_rec_tab is TABLE OF typ_rec_set;
    l_hdr_tab typ_rec_tab;
    begin
    SELECT product_id
    ,state
    ,zip_cd
    ,valid_from
    ,valid_to
    ,CASE WHEN valid_to = DATE '2099-12-31' THEN 'Y' ELSE 'N' END latest_flag
    BULK COLLECT INTO l_hdr_tab
    FROM
    (
    SELECT a.product_id
    ,a.state
    ,a.zip_cd
    ,a.modified_dt valid_from
    ,NVL(((LEAD (a.modified_dt,1) OVER (PARTITION BY a.product_id ORDER BY a.modified_dt)) - INTERVAL '1' SECOND),DATE '2099-12-31' )valid_to
    ,CASE
    WHEN ( ( b.product_id IS NOT NULL
    AND a.state != b.state
    AND a.zip_cd != b.zip_cd)
    OR b.product_id IS NULL
    ) THEN
    1
    ELSE
    0
    END insert_flag
    FROM table_a a
    LEFT OUTER JOIN table_b b
    ON a.product_id = b.product_id
    AND b.latest_flag = 'Y'
    WHERE (a.modified_dt >= b.valid_from OR b.product_id IS NULL)
    ORDER BY a.product_id,a.modified_dt
    )
    WHERE insert_flag != 0 ;
    --loop
    FOR i IN l_hdr_tab.first .. l_hdr_tab.last
    LOOP
    -- begin block
    begin
    insert into table_b
    (
    sequence_key ,
    PRODUCT_ID ,
    STATE ,
    ZIP_CD ,
    VALID_FROM ,
    VALID_TO ,
    LATEST_FLAG
    )
    values
    (
    ( select max(sequence_key)+1 from table_b ),
    l_hdr_tab(i).product_id ,
    l_hdr_tab(i).state ,
    l_hdr_tab(i).zip_cd ,
    l_hdr_tab(i).valid_from ,
    l_hdr_tab(i).valid_to ,
    l_hdr_tab(i).latest_flag
    );
    end;
    end loop;-- reset sequence base of row_number over product_id valid_from
    commit;
    -- reset sequence
    merge into table_b t
    using ( select sequence_key ,
    PRODUCT_ID ,
    STATE ,
    ZIP_CD ,
    VALID_FROM ,
    VALID_TO ,
    LATEST_FLAG ,
    row_number() over ( order by product_id,valid_from ) as new_seq
    from table_b ) s
    on ( s.rowid = t.rowid )
    when matched then
    update set t.sequence_key = s.new_seq where t.sequence_key != s.new_seq ;
    commit;
    exception when others then raise;
    end;
    /

    SQL> @proc.sql

    PL/SQL procedure successfully completed.

    SQL> select * from table_b order by sequence_key ;

    SEQUENCE_KEY PRODUCT_ID STATE ZIP_CD VALID_FRO VALID_TO L
    ------------ ------------------------------ ------------------------------ ------------------------------ --------- --------- -
    1 abc AR 999 05-MAR-20 05-MAR-20 N
    2 abc NY 722 05-MAR-20 05-MAR-20 N
    3 abc CA 777 05-MAR-20 31-DEC-99 Y
    4 abc KS 444 05-MAR-20 05-MAR-20 N
    5 abc MN 123 05-MAR-20 05-MAR-20 N
    6 abc AR 555 05-MAR-20 05-MAR-20 N
    7 abc CA 565 05-MAR-20 05-MAR-20 N
    8 abc OH 333 05-MAR-20 05-MAR-20 N
    9 abc IL 223 05-MAR-20 31-DEC-99 Y
    10 bbc MN 666 05-MAR-20 05-MAR-20 N
    11 bbc MN 123 05-MAR-20 05-MAR-20 N

    SEQUENCE_KEY PRODUCT_ID STATE ZIP_CD VALID_FRO VALID_TO L
    ------------ ------------------------------ ------------------------------ ------------------------------ --------- --------- -
    12 bbc MN 777 19-MAR-20 31-DEC-99 Y
    13 bbc IL 223 19-MAR-20 31-DEC-99 Y
    14 ccb MN 123 19-MAR-20 31-DEC-99 Y
    15 dbd KS 444 19-MAR-20 31-DEC-99 Y

    15 rows selected.

    SQL>
    只要让我知道您可能有任何疑问。我知道我肯定会想念一些东西 ;)
    更新
    我意识到我在循环中有一个无用的操作,即计算字段 SEQUENCE_KEY 的最大值。我在这里有一个更好的程序版本:
    declare
    type typ_rec_set IS RECORD
    (
    PRODUCT_ID VARCHAR2(30 CHAR),
    STATE VARCHAR2(30 CHAR),
    ZIP_CD VARCHAR2(30 CHAR),
    VALID_FROM DATE ,
    VALID_TO DATE ,
    LATEST_FLAG VARCHAR2(1 CHAR)
    );
    type typ_rec_tab is TABLE OF typ_rec_set;
    l_hdr_tab typ_rec_tab;
    r pls_integer := 1;
    vseq pls_integer;
    begin
    -- calculate value sequence
    select max(sequence_key) into vseq from table_b ;
    SELECT product_id
    ,state
    ,zip_cd
    ,valid_from
    ,valid_to
    ,CASE WHEN valid_to = DATE '2099-12-31' THEN 'Y' ELSE 'N' END latest_flag
    BULK COLLECT INTO l_hdr_tab
    FROM
    (
    SELECT a.product_id
    ,a.state
    ,a.zip_cd
    ,a.modified_dt valid_from
    ,NVL(((LEAD (a.modified_dt,1) OVER (PARTITION BY a.product_id ORDER BY a.modified_dt)) - INTERVAL '1' SECOND),DATE '2099-12-31' )valid_to
    ,CASE
    WHEN ( ( b.product_id IS NOT NULL
    AND a.state != b.state
    AND a.zip_cd != b.zip_cd)
    OR b.product_id IS NULL
    ) THEN
    1
    ELSE
    0
    END insert_flag
    FROM table_a a
    LEFT OUTER JOIN table_b b
    ON a.product_id = b.product_id
    AND b.latest_flag = 'Y'
    WHERE (a.modified_dt >= b.valid_from OR b.product_id IS NULL)
    ORDER BY a.product_id,a.modified_dt
    )
    WHERE insert_flag != 0 ;
    --loop
    FOR i IN l_hdr_tab.first .. l_hdr_tab.last
    LOOP
    -- begin block
    vseq := vseq + r ;
    begin
    insert into table_b
    (
    sequence_key ,
    PRODUCT_ID ,
    STATE ,
    ZIP_CD ,
    VALID_FROM ,
    VALID_TO ,
    LATEST_FLAG
    )
    values
    (
    vseq ,
    l_hdr_tab(i).product_id ,
    l_hdr_tab(i).state ,
    l_hdr_tab(i).zip_cd ,
    l_hdr_tab(i).valid_from ,
    l_hdr_tab(i).valid_to ,
    l_hdr_tab(i).latest_flag
    );
    end;
    r := r + 1;
    end loop;-- reset sequence base of row_number over product_id valid_from
    commit;
    -- reset sequence
    merge into table_b t
    using ( select sequence_key ,
    PRODUCT_ID ,
    STATE ,
    ZIP_CD ,
    VALID_FROM ,
    VALID_TO ,
    LATEST_FLAG ,
    row_number() over ( order by product_id,valid_from ) as new_seq
    from table_b ) s
    on ( s.rowid = t.rowid )
    when matched then
    update set t.sequence_key = s.new_seq where t.sequence_key != s.new_seq ;
    commit;
    exception when others then raise;
    end;
    /

    关于sql - oracle 跟踪带有时间戳列的表上的历史记录,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/63038654/

    28 4 0
    Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
    广告合作:1813099741@qq.com 6ren.com