
MySQL cumulative product group by

Reposted. Author: 行者123. Updated: 2023-12-04 01:58:33

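I have been working with the WRDS/CRSP dataset (a stock price database maintained by UPenn for academic research). I have been downloading the data with Python and inserting it into my local MySQL database.

The data looks like this, with a primary key on (quote_date, security_id):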

quote_date  security_id  tr            accum_index
10-Jan-86   10002        null          1000
13-Jan-86   10002        -0.026595745  973.4042548
14-Jan-86   10002        0.005464481   978.7234036
15-Jan-86   10002        -0.016304348  962.7659569
16-Jan-86   10002        0             962.7659569
17-Jan-86   10002        0             962.7659569
20-Jan-86   10002        0             962.7659569
21-Jan-86   10002        0.005524862   968.0851061
22-Jan-86   10002        -0.005494506  962.765957
23-Jan-86   10002        0             962.765957
24-Jan-86   10002        -0.005524862  957.4468078
27-Jan-86   10002        0.005555556   962.7659569
28-Jan-86   10002        0             962.7659569
29-Jan-86   10002        0             962.7659569
30-Jan-86   10002        0             962.7659569
31-Jan-86   10002        0.027624309   989.3617013
3-Feb-86    10002        0.016129032   1005.319148
4-Feb-86    10002        0.042328041   1047.872338
5-Feb-86    10002        0.04568528    1095.744679

I need to calculate the accum_index column, which is basically an index of the stock's total return, computed as follows:
accum_index_t = accum_index_{t-1} * (1 + tr_t)
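
As a quick check of this recurrence outside the database, here is a minimal Python sketch. It is not part of the original question: the function name `accum_index` and its `base` parameter are illustrative, and it assumes that a missing `tr` (the first row of each security) means a return of 0, so the index starts at 1000.

```python
from itertools import accumulate


def accum_index(returns, base=1000.0):
    """Compound per-period total returns into an index series.

    A None return (assumed to mark the first row of a security) is
    treated as 0, so the first index value equals `base`.
    """
    factors = [1.0 + (r if r is not None else 0.0) for r in returns]
    # accumulate with multiplication yields the running product of the factors
    return [base * p for p in accumulate(factors, lambda a, b: a * b)]


# First four tr values for security 10002 from the table above
sample_tr = [None, -0.026595745, 0.005464481, -0.016304348]
index = accum_index(sample_tr)
print([round(v, 4) for v in index])
```

The values produced this way agree with the accum_index column in the sample rows to within rounding.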

The table has 80m rows. I have written some code that iterates over each security_id and calculates the cumulative product, like this:
select @sid := min(security_id)
from stock_prices;

create temporary table prices (
    quote_date datetime,
    security_id int,
    tr double null,
    accum_index double null,
    PRIMARY KEY (quote_date, security_id)
);

while @sid is not null
do

    select 'security_id', @sid;
    select @accum := null;

    insert into prices
    select quote_date, security_id, tr, accum_index
    from stock_prices
    where security_id = @sid
    order by quote_date asc;

    update prices
    set accum_index = (@accum := ifnull(@accum * (1 + tr), 1000.0));

    update stock_prices p use index(PRIMARY), prices a use index(PRIMARY)
    set p.accum_index = a.accum_index
    where p.security_id = a.security_id
    and p.quote_date = a.quote_date;

    select @sid := min(security_id)
    from stock_prices
    where security_id > @sid;

    delete from prices;

end while;

drop table prices;

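But this is far too slow: it takes about a minute per security on my laptop, so calculating the whole series would take years. Is there a way to vectorize this?

Cheers,
Steve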

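Best answer

If you are using MySQL 8, you can use window functions to create a cumulative product. Unfortunately, there is no PROD() aggregate/window function in any SQL database that I know of, but you can emulate it using EXP(SUM(LOG(factor))):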

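SELECT
  quote_date,
  security_id,
  tr,
  -- A NULL tr (the first row of each security) is treated as a return of 0,
  -- i.e. a factor of 1, so the index starts at 1000.
  1000 * EXP(SUM(LOG(1 + COALESCE(tr, 0)))
    OVER (PARTITION BY security_id ORDER BY quote_date))
    AS accum_index
FROM stock_prices;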
(The original answer linked to a dbfiddle demo here.)
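
The identity behind this emulation, that a product of positive factors equals the exponential of the sum of their logarithms, can be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the function names are illustrative, and the factors are 1 + tr for the first sample rows, with the missing first tr assumed to be 0.

```python
import math
from itertools import accumulate


def running_product_via_logs(factors):
    """Running product computed as exp(running sum of logs).

    Only valid when every factor is strictly positive, because the
    logarithm is undefined otherwise.
    """
    return [math.exp(s) for s in accumulate(math.log(f) for f in factors)]


def running_product_direct(factors):
    """Reference implementation: plain running multiplication."""
    return list(accumulate(factors, lambda a, b: a * b))


# Factors 1 + tr for the first four sample rows (first tr assumed to be 0)
factors = [1.0, 1 - 0.026595745, 1 + 0.005464481, 1 - 0.016304348]
via_logs = running_product_via_logs(factors)
direct = running_product_direct(factors)

# The two methods agree up to floating-point rounding
assert all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(via_logs, direct))
print([round(1000 * p, 4) for p in via_logs])
```

One caveat with the logarithm approach: it requires 1 + tr to be strictly positive. In MySQL, LOG() of a non-positive argument returns NULL, so a tr of -1 or less (a total loss) would likely need separate handling.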

A similar question about MySQL cumulative product group by can be found on Stack Overflow: https://stackoverflow.com/questions/48970227/
