
r - Filling in values without a loop

Reposted. Author: 行者123. Updated: 2023-12-04 09:15:53

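I have a large data frame x that contains stock prices on specific dates. I want to merge this data set with a date variable and carry the last known observation of x forward until the next specific date, so that I end up with the data frame z. The example below shows the case of a single stock.

I am using a loop, but it is very slow because I have five to ten years of daily data and thousands of stocks.

Is there another way to do this? In Matlab, the same code runs much faster.

It is important that I can also use alternative conditions instead of the simple is.na(z[t,2]) condition.

Here is the example: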

> x=data.frame(c("2015-05-31","2015-06-30","2015-07-31"),c(100,200,150))
> colnames(x)=c("Date","AAPL")
> x[,1]=as.Date(x[,1],origin="1970-01-01")
>
> x
Date AAPL
1 2015-05-31 100
2 2015-06-30 200
3 2015-07-31 150
>
> date=data.frame(c("2015-05-31","2015-06-01","2015-06-02","2015-06-03","2015-06-04","2015-06-05","2015-06-06","2015-06-07","2015-06-08","2015-06-09","2015-06-10","2015-06-11","2015-06-12","2015-06-13","2015-06-14","2015-06-15","2015-06-16","2015-06-17","2015-06-18","2015-06-19","2015-06-20","2015-06-21","2015-06-22","2015-06-23","2015-06-24","2015-06-25","2015-06-26","2015-06-27","2015-06-28","2015-06-29","2015-06-30","2015-07-01","2015-07-02","2015-07-03","2015-07-04","2015-07-05","2015-07-06","2015-07-07","2015-07-08","2015-07-09","2015-07-10","2015-07-11","2015-07-12","2015-07-13","2015-07-14","2015-07-15","2015-07-16","2015-07-17","2015-07-18","2015-07-19","2015-07-20","2015-07-21","2015-07-22","2015-07-23","2015-07-24","2015-07-25","2015-07-26","2015-07-27","2015-07-28","2015-07-29","2015-07-30","2015-07-31"))
> colnames(date)=c("Date")
> date[,1]=as.Date(date[,1],origin="1970-01-01")
>
> date
Date
1 2015-05-31
2 2015-06-01
3 2015-06-02
29 ...
30 2015-06-29
31 2015-06-30
32 2015-07-01
33 2015-07-02

>
> z=merge(x=x, y=date, by.x="Date", by.y="Date",all.y=TRUE)
>
>
> #Converting z to a data matrix speeds up the loop
> z=data.matrix(z)
>
> for (t in 2:nrow(z)) {
+ # start at row 2 so that z[t-1,2] always exists
+ if (is.na(z[t,2])) {
+ # carry the previous row's value forward
+ z[t,2]=z[t-1,2]
+ }
+ }
>
> z=as.data.frame(z)
> z[,1]=as.Date(z[,1],origin="1970-01-01")
>
> z
Date AAPL
1 2015-05-31 100
2 2015-06-01 100
3 2015-06-02 100
29 ...
30 2015-06-29 100
31 2015-06-30 200
32 2015-07-01 200
33 2015-07-02 200

Best answer

Using the dplyr and zoo packages worked for me:

library(dplyr)
library(zoo)

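my_new_df <-
  right_join(x, date, by = "Date") %>%
  # sort by date first: the fill depends on row order, and the row order
  # returned by right_join() has differed between dplyr versions
  arrange(Date) %>%
  # na.rm = FALSE keeps any leading NA so the result has one value per row
  mutate(y = na.locf(AAPL, na.rm = FALSE))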

head(my_new_df)

Date AAPL y
1 2015-05-31 100 100
2 2015-06-01 NA 100
3 2015-06-02 NA 100
4 2015-06-03 NA 100
5 2015-06-04 NA 100
6 2015-06-05 NA 100

tail(my_new_df)

Date AAPL y
57 2015-07-26 NA 200
58 2015-07-27 NA 200
59 2015-07-28 NA 200
60 2015-07-29 NA 200
61 2015-07-30 NA 200
62 2015-07-31 150 150
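
The question also asks for a condition other than is.na() and mentions thousands of stocks. As a rough sketch only, the same carry-forward can be written in base R with cumsum() and vector indexing, so no extra packages are needed. The helper name fill_forward, its needs_fill argument, the MSFT column and all numbers below are made up for illustration; they are not part of the original question or answer.

```r
# Hypothetical helper (not from the original answer): carry the last accepted
# value forward without an explicit loop.
# needs_fill is a logical vector marking the entries to overwrite; it must not
# contain NA itself.
fill_forward <- function(v, needs_fill = is.na(v)) {
  keep <- !needs_fill
  idx <- cumsum(keep)      # position of the latest kept value within v[keep]
  idx[idx == 0] <- NA      # nothing to carry forward before the first kept value
  v[keep][idx]
}

# Small made-up table with two price columns (MSFT is illustrative only)
z <- data.frame(
  Date = as.Date("2015-05-31") + 0:5,
  AAPL = c(100, NA, NA, 200, NA, 150),
  MSFT = c(NA, 40, NA, -1, 45, NA)
)

# Default condition: fill wherever the value is NA
z$AAPL_filled <- fill_forward(z$AAPL)

# Alternative condition: also treat non-positive prices as missing
bad <- is.na(z$MSFT) | z$MSFT <= 0
z$MSFT_filled <- fill_forward(z$MSFT, needs_fill = bad)

# Apply the default fill to every price column at once
price_cols <- c("AAPL", "MSFT")
filled <- as.data.frame(lapply(z[price_cols], fill_forward))
```

The rows must already be sorted by date, since the fill only looks at row order. Values that come before the first accepted observation stay NA, because there is nothing to carry forward yet.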

Regarding "r - Filling in values without a loop", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32438481/
