
r - Dynamic CAGR calculation in R using dplyr

Reposted. Author: 行者123. Updated: 2023-12-04 03:01:16

I have the following data:

 Company    Year    Variables    Data
ABC 2000 Revenue 10
ABC 2001 Revenue 15
ABC 2002 Revenue 12
ABC 2003 Revenue 25
ABC 2004 Revenue 30
CDE 2000 Revenue 5
CDE 2001 Revenue 8
CDE 2002 Revenue 17
CDE 2003 Revenue 9
CDE 2004 Revenue 34

#etc

I want to calculate the compound annual growth rate (CAGR) over the past 3 years.

For example, the 3-year CAGR result for each company would be:
Company    Year    Variables    Data    CAGR
ABC 2000 Revenue 10 NA
ABC 2001 Revenue 15 NA
ABC 2002 Revenue 12 6.27%
ABC 2003 Revenue 25 18.56%
ABC 2004 Revenue 30 35.72%
CDE 2000 Revenue 5 NA
CDE 2001 Revenue 8 NA
CDE 2002 Revenue 17 50.37%
CDE 2003 Revenue 9 4.00%
CDE 2004 Revenue 34 25.99%

I apply the following formula to the data, year by year:
CAGR for 2004=((LastYear/PreviousYear)^(1/n))-1
For example for n = 2
LastYear =2004
PreviousYear =2004-2 = 2002
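
As a quick check of the formula, the sketch below plugs in the ABC revenue for 2004 and 2002 in base R. The helper name cagr is made up for illustration and is not part of the original post. Note that the expected table above appears to use an exponent of 1/3 over this two-year gap (counting 2002, 2003 and 2004 as three periods), whereas n = 2 as written in the formula gives a different figure, so both are shown.

```r
# Hypothetical helper (not from the original post): CAGR between two values,
# using `n` as the number of periods in the exponent.
cagr <- function(last, previous, n) {
  (last / previous)^(1 / n) - 1
}

# ABC revenue: 30 in 2004, 12 in 2002
cagr(30, 12, n = 2)  # exponent 1/2, as the formula is written
cagr(30, 12, n = 3)  # exponent 1/3, which matches the 35.72% in the table
```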

R code attempting to calculate the CAGR for 2004 relative to 2002:
library(tibble)
library(dplyr)
library(lubridate)

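year <- rep(2000:2004, 2)
# each = 5 keeps all ABC rows before the CDE rows, matching the data above
company <- rep(c("ABC", "CDE"), each = 5)
variable <- rep("revenue", 10)
data <- c(10, 15, 12, 25, 30, 5, 8, 17, 9, 34)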

tibdf<-tibble(company,year,variable,data)
View(tibdf)

#revenue2004<-tibdf%>%filter(year==2004)%>%select(company,data)
#revenue2002<-tibdf%>%filter(year==2002)%>%select(company,data)

Calculating the CAGR (taken from "Plot Compound Annual Growth Rate (3 independent variables) in R"):
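annual.growth.rate <- function(a) {
  # number of periods, counting both the first and the last year
  T1 <- max(a$year) - min(a$year) + 1
  FV <- a$data[a$year == max(a$year)]  # value in the last year
  SV <- a$data[a$year == min(a$year)]  # value in the first year
  (FV / SV)^(1 / T1) - 1               # returned as a plain number
}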

tibdf is meant to be passed as the argument a of this function.
Unfortunately, I have not been able to apply the function to my data.
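
One way to apply such a function per company is to split the data by company and call the function on each piece. The sketch below is self-contained and uses base R only. It rebuilds the sample data as a plain data frame (named revenue here, an illustrative name) and uses a variant of annual.growth.rate that extracts the data column as a vector and returns its result explicitly; both choices are assumptions about what was intended. It yields one whole-period figure per company, not the rolling value asked for above.

```r
# Sample data from the question, as a plain data frame
revenue <- data.frame(
  company = rep(c("ABC", "CDE"), each = 5),
  year    = rep(2000:2004, 2),
  data    = c(10, 15, 12, 25, 30, 5, 8, 17, 9, 34)
)

# Variant of annual.growth.rate that works on vectors and returns its result
annual.growth.rate <- function(a) {
  T1 <- max(a$year) - min(a$year) + 1  # periods, counting both end years
  FV <- a$data[a$year == max(a$year)]  # value in the last year
  SV <- a$data[a$year == min(a$year)]  # value in the first year
  (FV / SV)^(1 / T1) - 1
}

# Split by company, then apply the function to each piece
cagr_by_company <- sapply(split(revenue, revenue$company), annual.growth.rate)
cagr_by_company
```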

Thanks for your help.

Best answer

Here is one approach:

library(tidyverse)
df %>%
  arrange(Company, Year) %>%                  # in case the years are not in order (here they are)
  group_by(Company) %>%
  mutate(lagY = lag(Year),                    # get the lagged year
         lagD = lag(Data),                    # get the lagged Data
         t = Year - lagY,                     # calculate the time difference
         CAGR = (Data / lagD)^(1/t) - 1) %>%  # calculate CAGR
  select(-lagY, -lagD, -t)                    # remove unwanted variables


#output:
   Company  Year Variables  Data   CAGR
   <fct>   <int> <fct>     <int>  <dbl>
 1 ABC      2000 Revenue      10 NA
 2 ABC      2001 Revenue      15  0.500
 3 ABC      2002 Revenue      12 -0.200
 4 ABC      2003 Revenue      25  1.08
 5 ABC      2004 Revenue      30  0.200
 6 CDE      2000 Revenue       5 NA
 7 CDE      2001 Revenue       8  0.600
 8 CDE      2002 Revenue      17  1.12
 9 CDE      2003 Revenue       9 -0.471
10 CDE      2004 Revenue      34  2.78

Or, a bit more compactly, without creating intermediate variables:
df %>%
  arrange(Company, Year) %>%
  group_by(Company) %>%
  mutate(CAGR = (Data/lag(Data))^(1/(Year-lag(Year))) - 1)

Data:
df <- read.table(text ="Company    Year    Variables    Data
ABC 2000 Revenue 10
ABC 2001 Revenue 15
ABC 2002 Revenue 12
ABC 2003 Revenue 25
ABC 2004 Revenue 30
CDE 2000 Revenue 5
CDE 2001 Revenue 8
CDE 2002 Revenue 17
CDE 2003 Revenue 9
CDE 2004 Revenue 34", header = T)
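
The code in this answer gives year-over-year growth. To get closer to the table in the question, one possible adaptation (an assumption, not part of the original answer) is to lag by two rows and use an exponent of 1/3, which is what the expected figures appear to correspond to. The names k, periods, res and CAGR3 are illustrative, and the sketch assumes exactly one row per company and year with no gaps.

```r
library(dplyr)

# Sample data from the question, rebuilt so the block runs on its own
df <- data.frame(
  Company = rep(c("ABC", "CDE"), each = 5),
  Year    = rep(2000:2004, 2),
  Data    = c(10, 15, 12, 25, 30, 5, 8, 17, 9, 34)
)

k <- 2        # rows to look back (e.g. 2004 vs. 2002)
periods <- 3  # exponent denominator implied by the expected table (assumption)

res <- df %>%
  arrange(Company, Year) %>%
  group_by(Company) %>%
  mutate(CAGR3 = (Data / dplyr::lag(Data, k))^(1 / periods) - 1) %>%
  ungroup()

# Show the result as percentages rounded to two decimals
res %>% mutate(CAGR3 = round(100 * CAGR3, 2))
```

The first two years of each company come out as NA because there is no value two rows back within the group.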

Regarding "r - Dynamic CAGR calculation in R using dplyr", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48984116/
