
r - Identifying the median of a variable over runs of consecutive 0s in another variable

Reposted. Author: 行者123. Last updated: 2023-12-01 10:21:59

Here is what my data looks like.

 Group, Sales,flag,Count
Paris,6738,0,15
Paris,5235,1,23
Paris,5907,1,15
Paris,5527,0,28
Paris,6934,1,27
Paris,6757,0,20
Paris,5394,1,31
Paris,5379,0,36
Paris,6266,1,40
Paris,5512,1,39
Paris,6506,1,29
Paris,5006,1,22
Paris,6465,1,17
Paris,6653,0,38
Paris,6719,0,12
New York,5333,1,19
New York,6763,1,37
New York,6468,0,32
New York,6923,0,34
New York,6705,0,16
New York,6542,0,11
New York,6497,0,19
New York,6616,0,27
New York,6788,0,26
New York,5876,1,33
New York,5382,0,40
New York,5688,0,34
New York,6667,1,20
New York,5929,1,28
New York,6096,0,30

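For each city, I want to compute the median of Sales over each run of consecutive zeros in flag that comes before or after a flag of 1.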

Here is the code I used, as suggested in the comments.

setDT(c)[, .(median(Sales), median(Count)), .(City, rleid(flag))][rleid %% 2 == 1, .(City, median = V1, count = V2)]

Here is the output I got after running the suggested code.

head(d,20)
City median count
1: Paris 6738.000 15.00000
2: Paris 5527.000 28.00000
3: Paris 6757.000 20.00000
4: Paris 5379.000 36.00000
5: Paris 6686.000 25.00000
6: NY 6648.429 23.57143
7: NY 5535.000 37.00000
8: NY 6096.000 30.00000

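The expected output is attached below. The New York group shows a discrepancy in the medians of Sales and Count.

R code output: row 6, NY - 6648.429 and Count - 23.57

Excel output: NY - 6616 and Count - 26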

[Image: expected output]

Thanks, Jay

Best answer

Base R

x <- read.csv(header=TRUE, stringsAsFactors=FALSE, text='
City, Sales, flag
Paris, 3000, 0
Paris, 4000, 0
Paris, 5000, 0
Paris, 3000, 1
Paris, 3000, 0
Paris, 4000, 0
Paris, 4500, 0
NY, 3000, 1
NY, 4000, 0
NY, 5000, 0
NY, 3000, 1
NY, 3000, 0
NY, 4000, 0
NY, 4500, 1')

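# Number each run of identical flag values, then average Sales within each City/run.
# The question asks for a median: swap mean() for median() to get that instead.
do.call(rbind,
        by(x, list(x$City, cumsum(c(0, diff(x$flag) != 0))),
           function(a) { a$Sales <- mean(a$Sales); a[1, , drop = FALSE] }))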
# City Sales flag
# 1 Paris 4000.000 0
# 4 Paris 3000.000 1
# 5 Paris 3833.333 0
# 8 NY 3000.000 1
# 9 NY 4500.000 0
# 11 NY 3000.000 1
# 12 NY 3500.000 0
# 14 NY 4500.000 1
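
The New York discrepancy in the question can be checked by hand. The following is a minimal base-R sketch, not part of the original answer: it takes only the New York rows from the question's data, labels runs of identical flag values with rle(), and summarises the runs of zeros. The names ny, run, zeros, by_median and by_mean are illustrative. For the first run of zeros the median matches the Excel figures (6616 and 26), while the mean matches the figures reported from R (6648.429 and 23.57), which suggests that mean() rather than median() was applied when that output was produced. That last point is an inference from the numbers, not something the post states.

```r
# New York rows from the question (Sales, flag, Count), copied from the post.
ny <- data.frame(
  Sales = c(5333, 6763, 6468, 6923, 6705, 6542, 6497, 6616, 6788,
            5876, 5382, 5688, 6667, 5929, 6096),
  flag  = c(1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0),
  Count = c(19, 37, 32, 34, 16, 11, 19, 27, 26, 33, 40, 34, 20, 28, 30)
)

# Label each run of identical flag values 1, 2, 3, ...
r <- rle(ny$flag)
ny$run <- rep(seq_along(r$lengths), r$lengths)

# Keep only the rows where flag is 0 and summarise each run.
zeros <- ny[ny$flag == 0, ]
by_median <- aggregate(cbind(Sales, Count) ~ run, data = zeros, FUN = median)
by_mean   <- aggregate(cbind(Sales, Count) ~ run, data = zeros, FUN = mean)

print(by_median)  # first zero run: Sales 6616, Count 26
print(by_mean)    # first zero run: Sales 6648.429, Count 23.57143
```

For the full data set the same idea applies per city, for example by adding the city column to the right-hand side of the aggregate() formula.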

dplyr

library(dplyr)
x %>%
  mutate(flaggroup = cumsum(c(0, diff(flag) != 0))) %>%   # run id: increments whenever flag changes
  group_by(City, flaggroup) %>%
  summarize(Sales = mean(Sales), flag = first(flag)) %>%  # use median(Sales) for a median
  ungroup() %>%
  select(-flaggroup)
# # A tibble: 8 × 3
# City Sales flag
# <chr> <dbl> <int>
# 1 NY 3000.000 1
# 2 NY 4500.000 0
# 3 NY 3000.000 1
# 4 NY 3500.000 0
# 5 NY 4500.000 1
# 6 Paris 4000.000 0
# 7 Paris 3000.000 1
# 8 Paris 3833.333 0
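
Since the question asks for medians over the runs of zeros only, a variant of the dplyr pipeline might look like the sketch below. It is an adaptation, not the answer's own code: it rebuilds the answer's toy data with data.frame(), swaps mean() for median(), filters to flag == 0, and adds a run length column n. The names zero_runs and n are illustrative, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Toy data from the answer, rebuilt with data.frame() so the block is self-contained.
x <- data.frame(
  City  = c(rep("Paris", 7), rep("NY", 7)),
  Sales = c(3000, 4000, 5000, 3000, 3000, 4000, 4500,
            3000, 4000, 5000, 3000, 3000, 4000, 4500),
  flag  = c(0, 0, 0, 1, 0, 0, 0,
            1, 0, 0, 1, 0, 0, 1)
)

zero_runs <- x %>%
  mutate(flaggroup = cumsum(c(0, diff(flag) != 0))) %>%  # run id, computed before filtering
  filter(flag == 0) %>%                                  # keep only the runs of zeros
  group_by(City, flaggroup) %>%
  summarize(Sales = median(Sales), n = n(), .groups = "drop")

print(zero_runs)
```

The run id is computed over the whole table, so grouping by City as well keeps a run from spanning two cities when one city ends and the next begins with the same flag value.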

Regarding "r - Identifying the median of a variable over runs of consecutive 0s in another variable", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50296320/
