
r - Count values less than x and find the nearest values to x by multiple groups

Reposted. Author: 行者123. Updated: 2023-12-02 03:00:53

Sample data frame data

        uid     bas_id dist2mouth type
2020   2019 W3A9101601   2.413629    1
2021   2020 W3A9101601   2.413629    1
2022   2021 W3A9101602   2.413629    1
2023   2022 W3A9101602   3.313893    1
2032   2031 W3A9101602   3.313893    1
2033   2032 W3A9101602   3.313893    1
2034   2033 W3A9101602   3.313893    1
15023 15022 W3A9101601   1.349000    2
15025 15024 W3A9101601   3.880000    2
15026 15025 W3A9101602   3.880000    2
15027 15026 W3A9101602   0.541101    2
16106 17097 W3A9101602   1.349000    2

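For each row, I want to count the number of rows with type=2 within the same bas_id that have a lower dist2mouth. In effect, how many type=2 rows lie downstream of each row. Store this as ds_n_type2. So far I have tried dplyr: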
ds <- data %>%
  group_by(id) %>%
  summarize(n_ds = sum(dist2mouth > id[dist2mouth]))

Then I want to find the closest type=2 row to each type=1 row within the same bas_id, perhaps using which, a for loop, or apply. Store this as closest_uid_type2. Maybe something like
which(abs(x[i:n]-x[i])==min(abs(x[i:n]-x[i])))

Happy to clarify.

Edit 2: revised the desired output
        uid     bas_id dist2mouth type ds_n_type2 closest_uid_type2
2020   2019 W3A9101601   2.413629    1          1             15022
2021   2020 W3A9101601   2.413629    1          1             15022
2022   2021 W3A9101602   2.413629    1          2             15022
2023   2022 W3A9101602   3.313893    1          2             15024
2032   2031 W3A9101602   3.313893    1          2             15024
2033   2032 W3A9101602   3.313893    1          2             15024
2034   2033 W3A9101602   3.313893    1          2             15024
15023 15022 W3A9101601   1.349000    2          -                 -
15025 15024 W3A9101601   3.880000    2          -                 -
15026 15025 W3A9101602   3.880000    2          -                 -
15027 15026 W3A9101602   0.541101    2          -                 -
16106 17097 W3A9101602   1.349000    2          -                 -

Best answer

Try this:

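library(dplyr)

df %>%
  group_by(bas_id) %>%
  # position of each dist2mouth in the sorted group, minus one:
  # the number of rows in the same bas_id with a lower dist2mouth
  mutate(n_ds = match(dist2mouth, sort(dist2mouth)) - 1) %>%
  # pairwise absolute differences within the group; in each column the
  # smallest value is the row itself (0), so the second smallest marks the
  # closest other row (this assumes dist2mouth is unique within a group)
  mutate(closest_uid = apply(
    sapply(dist2mouth, function(i) abs(i - dist2mouth)),
    2, function(n) uid[which(n == sort(n)[2])])) %>%
  data.frame()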

Output:
  uid dist2mouth bas_id type n_ds closest_uid
1   1         10      1    1    2           4
2   2          5      1    2    0           3
3   3          6      1    1    1           2
4   4         11      1    1    3           1
5   5          3      2    2    0           6
6   6          4      2    1    1           5
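
To make the two steps above easier to follow, here is a minimal base R sketch on a single group. The vector x is a hypothetical stand-in that reuses the dist2mouth values of bas_id 1 from the output above; the names n_lower, d and nearest are illustrative only. The nearest-neighbour part uses which.min on a distance matrix with the diagonal masked, which is a swapped-in alternative to the sort(n)[2] trick, not the answer's own code.

```r
# Hypothetical distances for one group (the bas_id 1 rows shown above)
x <- c(10, 5, 6, 11)

# Position in the sorted vector minus one = number of strictly smaller values
n_lower <- match(x, sort(x)) - 1

# The same count written out explicitly, for comparison
n_lower_explicit <- sapply(x, function(v) sum(x < v))

# Pairwise absolute differences; mask the diagonal so a value
# is never reported as its own nearest neighbour
d <- abs(outer(x, x, "-"))
diag(d) <- Inf

# Index of the closest other element for each element
nearest <- apply(d, 1, which.min)

print(n_lower)
print(nearest)
```

Masking the diagonal keeps the lookup well defined when two rows share the same distance, a case where picking the second smallest difference can return more than one match.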

Edit:

This may not be the most elegant, but here is one way to solve the updated question (pending time to improve it):
is1 <- df$type == 1   # type 1 rows; which(is1) gives their positions even if row names are not 1..n
# count the type 2 rows in the same bas_id with a smaller dist2mouth
df$ds_n_type2[is1] <- sapply(which(is1), function(x)
  sum(df$dist2mouth[x] > df$dist2mouth[df$bas_id == df$bas_id[x] & df$type == 2]))

# nearest type 2 distance in the same bas_id, then the uid of the first row (anywhere in df) with that distance
df$closest_uid_type2[is1] <- sapply(which(is1), function(x) {
  d2 <- df$dist2mouth[df$bas_id == df$bas_id[x] & df$type == 2]
  df$uid[which(df$dist2mouth == d2[which.min(abs(d2 - df$dist2mouth[x]))])[1]]
})

Output:
      uid     bas_id dist2mouth type ds_n_type2 closest_uid_type2
 1:  2019 W3A9101601   2.413629    1          1             15022
 2:  2020 W3A9101601   2.413629    1          1             15022
 3:  2021 W3A9101602   2.413629    1          2             15022
 4:  2022 W3A9101602   3.313893    1          2             15024
 5:  2031 W3A9101602   3.313893    1          2             15024
 6:  2032 W3A9101602   3.313893    1          2             15024
 7:  2033 W3A9101602   3.313893    1          2             15024
 8: 15022 W3A9101601   1.349000    2         NA                NA
 9: 15024 W3A9101601   3.880000    2         NA                NA
10: 15025 W3A9101602   3.880000    2         NA                NA
11: 15026 W3A9101602   0.541101    2         NA                NA
12: 17097 W3A9101602   1.349000    2         NA                NA
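
One thing worth noting about the code above: the final uid lookup matches on the dist2mouth value across the whole data frame, so when the same distance occurs in more than one bas_id, the first matching row is returned, whichever group it belongs to. If the uid should always come from the same bas_id, a variant along the following lines might be used. This is only a sketch in base R: the data frame is rebuilt from the sample data in the question, and the column name closest_uid_same_group is hypothetical.

```r
# Sample data from the question, with uid as an ordinary column
df <- data.frame(
  uid = c(2019, 2020, 2021, 2022, 2031, 2032, 2033,
          15022, 15024, 15025, 15026, 17097),
  bas_id = c("W3A9101601", "W3A9101601", rep("W3A9101602", 5),
             "W3A9101601", "W3A9101601", rep("W3A9101602", 3)),
  dist2mouth = c(2.413629, 2.413629, 2.413629,
                 3.313893, 3.313893, 3.313893, 3.313893,
                 1.349000, 3.880000, 3.880000, 0.541101, 1.349000),
  type = c(rep(1, 7), rep(2, 5)),
  stringsAsFactors = FALSE
)

is1 <- df$type == 1
df$closest_uid_same_group <- NA_real_   # hypothetical column name

df$closest_uid_same_group[is1] <- sapply(which(is1), function(i) {
  # candidate rows: type 2 and the same bas_id as row i
  cand <- which(df$bas_id == df$bas_id[i] & df$type == 2)
  if (length(cand) == 0) return(NA_real_)
  # pick the candidate row itself, not the first row sharing its distance
  df$uid[cand[which.min(abs(df$dist2mouth[cand] - df$dist2mouth[i]))]]
})

print(df)
```

On this sample the W3A9101602 rows would get 17097 and 15025 rather than 15022 and 15024, because those are the type=2 rows inside that group; which of the two behaviours is wanted depends on how the desired output in the question is meant to be read.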

Regarding "r - Count values less than x and find the nearest values to x by multiple groups", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46242567/
