gpt4 book ai didi

sql - 使用 SAS 创建在特定日期拆分记录的表

转载 作者:行者123 更新时间:2023-12-02 01:49:21 30 4
gpt4 key购买 nike

我希望使用表 A 创建类似表 B 的内容,但基于表 C 中包含的一组任意拆分日期。

例如,(请注意,start_date = inception_date 并不总是正确的,因此必须保留 inception_date 而不是从 start_date 派生;这实际上代表了数百个属于该期间的字段)

enter image description here

我在 SAS 工作,但我希望能够使用 PROC SQL 编写此代码。我认为一种方法是为表 C 中的记录对(包括末尾的空值)创建多个表,然后将它们联合在一起。

伪代码示例:

for each record of table_c, concoct the pairs { (., 01-Jan-2012), (01-Jan-2012, 01-Jul-2012), (01-Jul-2012, 01-Jan-2013), (01-Jan-2013, .) }

以下查询可能需要围绕 split_date1split_date2 进行一些空测试:

CREATE TABLE subquery1 AS
SELECT
a.customer_id
,max(a.start_date, x.split_date1) AS start_date
,min(a.end_date, x.split_date2 - 1) AS end_date
,a.inception_date
FROM table_a AS a
JOIN split_date AS x
;
.... (do for each pair of split dates, and then union all these tables together with some WHERE querying to throw away the nonsensical rows) to produce table_b. The image above indicates which subquery would generate which rows in table_b

请帮我填补空白,或建议替代方法。

表_a:

customer_id start_date  end_date    inception_date      
aaa 18-Jun-11 17-Jun-12 18-Jun-11
aaa 18-Jun-12 17-Jun-13 18-Jun-12
bbb 13-Jul-11 12-Jul-12 13-Jul-11
ccc 14-May-11 13-Nov-11 14-Jul-11
ddd 21-Jun-11 20-Jun-12 21-Jun-11

表_b:

customer_id start_date  end_date    inception_date      subquery
aaa 18-Jun-11 31-Dec-11 18-Jun-11 (1)
aaa 01-Jan-12 17-Jun-12 18-Jun-11 (2)
aaa 18-Jun-12 30-Jun-12 18-Jun-12 (2)
aaa 01-Jul-12 31-Dec-12 18-Jun-12 (3)
aaa 01-Jan-13 17-Jun-13 18-Jun-12 (4)
bbb 13-Jul-11 31-Dec-11 13-Jul-11 (1)
bbb 01-Jan-12 30-Jun-12 13-Jul-11 (2)
bbb 01-Jul-12 12-Jul-12 13-Jul-11 (3)
ccc 14-May-11 13-Nov-11 14-May-11 (1)
ddd 21-Jun-11 31-Dec-11 21-Jun-11 (1)
ddd 01-Jan-12 20-Jun-12 21-Jun-11 (2)

表_c:

split_dates                 
01-Jan-12
01-Jul-12
01-Jan-13

最佳答案

这是一种混合 SQL/datastep 方法 - 但它更短!输入数据(取自@Joe给出的答案):-

data table_a;
informat start_date end_date date9.;
format start_date end_date date9.;
input customer_id $ start_date end_date;
datalines;
aaa 18JUN2011 17JUN2012
aaa 18JUN2012 17JUN2013
bbb 13JUL2011 12JUL2012
ccc 14MAY2011 13NOV2011
ddd 21JUN2011 20JUN2012
;;;;
run;

data table_c;
informat split_dates date9.;
format split_dates date9.;
input split_dates;
datalines;
01JAN2012
01JUL2012
01JAN2013
;;;;
run;

以下将拆分日期复制到一个宏变量(SQL!),然后使用该宏(datastep!)循环遍历 table_a:-

**  Output the split dates to a macro variable;
proc sql noprint;
select split_dates format=8. into: c_dates separated by ',' from table_c order by split_dates;
quit;

** For each period in table_a, look to see if each split date is within it,;
** outputting a row if so;
data final_out(drop=dt old_end_date);
set table_a(rename=(end_date = old_end_date));

format start_date end_date inception_date date11.;
inception_date = start_date;

do dt = &c_dates;

if start_date <= dt <= old_end_date then do;
end_date = dt - 1;
output;
start_date = dt;
end;

end;

** For the last row per table_a entry;
end_date = old_end_date;
output;
run;

如果您事先知道拆分日期,您可以将它们硬编码到数据步骤中并省略 SQL 位(不建议记住 - 硬编码很少是一个好主意)。

关于sql - 使用 SAS 创建在特定日期拆分记录的表,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/23814872/

30 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com