
r - How to sum with concurrent values in R

Repost. Author: 行者123. Updated: 2023-12-04 01:05:10

Below is the sample data and the code I am using.

library(dplyr)
library(data.table)

firm <- c("firm1","firm2","firm3","firm4","firm5","firm6","firm7","firm8","firm9")
employment <- c(1,50,90,249,499,115,145,261,210)
small <- c(1,1,1,3,4,2,2,4,3)

smbtest <- data.frame(firm,employment,small)

smbsummary <- smbtest %>%
  select(employment,small) %>%
  group_by(small) %>%
  summarise(employment = sum(employment), worksites = n())

The desired result is this:

smb   employment   worksites
1     141          3
2     401          5
3     860          7
4     1620         9

smb stands for small business. The smb values follow this pattern:

smb1   >= 0 and <100
smb2   >= 0 and <150
smb3   >= 0 and <250
smb4   >= 0 and <500

This is what I am currently getting:

smb   employment   worksites
1     141          3
2     260          2
3     459          2
4     760          2

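So the question is how to set this up so that the bands are not split into mutually exclusive groups. I want the result to be cumulative. If that makes sense, the 0 to 100 band is part of the 0 to 150 band.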
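As a rough illustration of the nesting described above (not part of the original code), the cumulative totals can be computed straight from the band limits in base R. The vector name `upper` and the data frame name `cumulative` are assumptions made for this sketch; the exclusive upper bounds are taken from the smb pattern listed earlier.

```r
# Sample data from the question
employment <- c(1, 50, 90, 249, 499, 115, 145, 261, 210)

# Exclusive upper bounds of smb1..smb4 (hypothetical helper vector)
upper <- c(100, 150, 250, 500)

# Each band contains every firm below its upper bound, so the bands overlap
cumulative <- data.frame(
  smb        = seq_along(upper),
  employment = sapply(upper, function(u) sum(employment[employment < u])),
  worksites  = sapply(upper, function(u) sum(employment < u))
)

print(cumulative)
```

With the sample data this should reproduce the desired table: employment 141, 401, 860, 1620 and worksites 3, 5, 7, 9.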

Best answer

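We only need to add a mutate step at the end to get the cumulative sum (cumsum) of 'worksites'. By default, summarise drops the last grouping level (with a single grouping variable it is dropped automatically, but it is better to specify .groups explicitly).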

library(dplyr)
smbtest %>%
  select(employment,small) %>%
  group_by(small) %>%
  summarise(employment = sum(employment), worksites = n(),
            .groups = 'drop') %>%
  mutate(worksites = cumsum(worksites))

-output

# A tibble: 4 x 3
#   small employment worksites
#*  <dbl>      <dbl>     <int>
#1      1        141         3
#2      2        260         5
#3      3        459         7
#4      4        760         9
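To see where the worksites column comes from, here is a small base R sketch of the same running total. The names `per_band` and `running` are made up for this illustration. It relies on the bands being in ascending order, which `table()` gives here and which `summarise()` after `group_by()` also provides.

```r
# Band labels from the question
small <- c(1, 1, 1, 3, 4, 2, 2, 4, 3)

# Number of firms per band, ordered by band: 3 2 2 2
per_band <- as.vector(table(small))

# Running total across bands: 3 5 7 9
running <- cumsum(per_band)
print(running)
```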

If we need the cumulative sum of multiple columns, use across:

smbtest %>%
  select(employment,small) %>%
  group_by(small) %>%
  summarise(employment = sum(employment), worksites = n(),
            .groups = 'drop') %>%
  mutate(across(c(employment, worksites), cumsum))
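No output is shown for the across version, so the following self-contained sketch runs it on the sample data and checks it against the desired table from the question. The name `result` is an assumption for this example, and it assumes a dplyr version that provides `across()` (1.0.0 or later).

```r
library(dplyr)

# Sample data from the question
smbtest <- data.frame(
  firm       = paste0("firm", 1:9),
  employment = c(1, 50, 90, 249, 499, 115, 145, 261, 210),
  small      = c(1, 1, 1, 3, 4, 2, 2, 4, 3)
)

# Per-band totals first, then a running total over both summary columns
result <- smbtest %>%
  group_by(small) %>%
  summarise(employment = sum(employment), worksites = n(),
            .groups = "drop") %>%
  mutate(across(c(employment, worksites), cumsum))

# Compare with the desired result stated in the question
stopifnot(
  all(result$employment == c(141, 401, 860, 1620)),
  all(result$worksites == c(3, 5, 7, 9))
)
```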

Or a similar option with collapse:

library(collapse)
# Note: fNobs() is named fnobs() in newer collapse releases (renamed in 1.6.0, as far as I know)
slt(smbtest, employment, small) %>%
  fgroup_by(small) %>%
  fsummarise(worksites = fNobs(employment),
             employment = fsum(employment)) %>%
  ftransformv(-1, cumsum)
# A tibble: 4 x 3
#   small worksites employment
#   <dbl>     <int>      <dbl>
#1      1         3        141
#2      2         5        401
#3      3         7        860
#4      4         9       1620

Data

smbtest <- tibble(firm, employment, small)

Regarding r - How to sum with concurrent values in R, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66696227/
