
r - Group by and count to get a close rate

Reposted. Author: 行者123. Updated: 2023-12-04 03:06:25

I want to count, per country, how many times status is open and how many times status is closed, and then calculate the closerate for each country.

Data:

customer <- c(1,2,3,4,5,6,7,8,9)
country <- c('BE', 'NL', 'NL','NL','BE','NL','BE','BE','NL')
closeday <- c('2017-08-23', '2017-08-05', '2017-08-22', '2017-08-26',
'2017-08-25', '2017-08-13', '2017-08-30', '2017-08-05', '2017-08-23')
closeday <- as.Date(closeday)

df <- data.frame(customer,country,closeday)

Add status:
df$status <- ifelse(df$closeday < '2017-08-20', 'open', 'closed') 

  customer country   closeday status
1        1      BE 2017-08-23 closed
2        2      NL 2017-08-05   open
3        3      NL 2017-08-22 closed
4        4      NL 2017-08-26 closed
5        5      BE 2017-08-25 closed
6        6      NL 2017-08-13   open
7        7      BE 2017-08-30 closed
8        8      BE 2017-08-05   open
9        9      NL 2017-08-23 closed

Calculate closerate:
closerate <- length(which(df$status == 'closed')) / 
(length(which(df$status == 'closed')) + length(which(df$status == 'open')))

[1] 0.6666667

Obviously, this is the closerate over all rows. The challenge is to get the closerate per country. I tried adding the closerate calculation to df like this:
df$closerate <- length(which(df$status == 'closed')) / 
(length(which(df$status == 'closed')) + length(which(df$status == 'open')))

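But this gives every row a closerate of 0.66, because I am not grouping. I believe I should not be using the length function, since the counting can be done by grouping. I read a bit about using dplyr to count logical outputs per group, but that did not work out.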

Desired output: a closerate for each country.

Best answer

aggregate(list(output = df$status == "closed"),
          list(country = df$country),
          function(x)
              c(close = sum(x),
                open = length(x) - sum(x),
                rate = mean(x)))
#  country output.close output.open output.rate
#1      BE         3.00        1.00        0.75
#2      NL         3.00        2.00        0.60
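
One thing to watch with the aggregate() call above: because the function returns a vector of three values, the result stores them in a single matrix column named output, even though it prints as output.close, output.open and output.rate. Below is a minimal sketch, rebuilding the question's data, of flattening that into ordinary columns with do.call(data.frame, ...). The names res and flat are illustrative and not part of the original answer.

```r
# Rebuild the question's data so the block runs on its own
country  <- c('BE', 'NL', 'NL', 'NL', 'BE', 'NL', 'BE', 'BE', 'NL')
closeday <- as.Date(c('2017-08-23', '2017-08-05', '2017-08-22', '2017-08-26',
                      '2017-08-25', '2017-08-13', '2017-08-30', '2017-08-05',
                      '2017-08-23'))
df <- data.frame(customer = 1:9, country, closeday)
df$status <- ifelse(df$closeday < as.Date('2017-08-20'), 'open', 'closed')

# Same aggregate() call as in the answer, stored in a variable
res <- aggregate(list(output = df$status == "closed"),
                 list(country = df$country),
                 function(x) c(close = sum(x),
                               open  = length(x) - sum(x),
                               rate  = mean(x)))

names(res)    # only two columns: "country" and the matrix column "output"

# Expand the matrix column into three regular columns
flat <- do.call(data.frame, res)
names(flat)   # "country" "output.close" "output.open" "output.rate"
flat$output.rate
```

After flattening, flat$output.rate can be used like any other numeric column, for example in a merge back onto df by country.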

There was a solution using table in the comments, but it seems to have been deleted. In any case, you can also use table:
output = as.data.frame.matrix(table(df$country, df$status))
output$closerate = output$closed/(output$closed + output$open)
output
#   closed open closerate
#BE      3    1      0.75
#NL      3    2      0.60
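
The question also tried to store the rate on df itself. As a hedged alternative that is not in the original answer, here is a base R sketch on the same data: tapply() gives one rate per country, and ave() repeats each country's rate on every row. The variable names closed and rates are illustrative.

```r
# Rebuild the question's data so the block runs on its own
country  <- c('BE', 'NL', 'NL', 'NL', 'BE', 'NL', 'BE', 'BE', 'NL')
closeday <- as.Date(c('2017-08-23', '2017-08-05', '2017-08-22', '2017-08-26',
                      '2017-08-25', '2017-08-13', '2017-08-30', '2017-08-05',
                      '2017-08-23'))
df <- data.frame(customer = 1:9, country, closeday)
df$status <- ifelse(df$closeday < as.Date('2017-08-20'), 'open', 'closed')

# Logical vector: TRUE where the row is closed
closed <- df$status == 'closed'

# One closerate per country; the mean of a logical is the share of TRUE values
rates <- tapply(closed, df$country, mean)
rates

# Per-row column: ave() computes the group mean (its default FUN)
# and returns it aligned with the original rows
df$closerate <- ave(as.numeric(closed), df$country)
df
```

This avoids the length(which(...)) pattern entirely: the grouping does the counting, and every BE row ends up with the same rate, as does every NL row.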

Regarding "r - Group by and count to get a close rate", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46083507/
