gpt4 book ai didi

R:如何在保留其他列的同时聚合一些列

转载 作者:行者123 更新时间:2023-12-01 07:24:01 24 4
gpt4 key购买 nike

我遇到了类似的问题 here ,但我尝试过的所有解决方案都没有奏效。

给定一个这样的表:

Date    Exercise    Category    Weight  Reps    EstMax  RepxWeight  Note
4/2/16 Deadlift Legs 135 7 166.4685 7x135 easy
4/2/16 Deadlift Legs 135 7 166.4685 7x135 kinda easy
4/2/16 Deadlift Legs 135 7 166.4685 7x135 tired
4/2/16 Bench Press Chest 95 5 110.8175 5x95 hard
4/2/16 Bench Press Chest 135 2 143.991 2x135 not hard
4/9/16 Bench Press Chest 135 2 143.991 2x135 a little hard
4/9/16 Bench Press Chest 135 2 143.991 2x135 super tired
4/18/16 Deadlift Legs 155 8 196.292 8x155 …
4/18/16 Deadlift Legs 155 5 180.8075 5x155 bad day
5/8/16 Deadlift Legs 185 3 203.4815 3x185 good day
5/8/16 Deadlift Legs 185 3 203.4815 3x185 felt easy
5/8/16 Bench Press Chest 115 4 130.318 4x115 easy
5/8/16 Bench Press Chest 115 4 130.318 4x115 hard

我要 aggregate获取具有 max 的行基于多个其他列(例如 EstMaxDate )的特定列(例如 Exercise )的值,但也保留该行中的所有其他列。如果多个条目具有相同的最大值,则取第一个条目。

预期的输出如下所示:
Date    Exercise    Category    Weight  Reps    EstMax  RepxWeight  Note
4/2/16 Deadlift Legs 135 7 166.4685 7x135 easy
4/2/16 Bench Press Chest 135 2 143.991 2x135 not hard
4/9/16 Bench Press Chest 135 2 143.991 2x135 a little hard
4/18/16 Deadlift Legs 155 8 196.292 8x155 …
5/8/16 Deadlift Legs 185 3 203.4815 3x185 good day
5/8/16 Bench Press Chest 115 4 130.318 4x115 hard

我尝试过的一些方法的例子;在每种情况下,“额外的列”最终都被用作聚合的因素,这不是我想要的。
data <- structure(list(Date = structure(c(2L, 2L, 2L, 2L, 2L, 3L, 3L, 
1L, 1L, 4L, 4L, 4L, 4L), .Label = c("4/18/16", "4/2/16", "4/9/16",
"5/8/16"), class = "factor"), Exercise = structure(c(2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L), .Label = c("Bench Press",
"Deadlift"), class = "factor"), Category = structure(c(2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L), .Label = c("Chest",
"Legs"), class = "factor"), Weight = c(135L, 135L, 135L, 95L,
135L, 135L, 135L, 155L, 155L, 185L, 185L, 115L, 115L), Reps = c(7L,
7L, 7L, 5L, 2L, 2L, 2L, 8L, 5L, 3L, 3L, 4L, 4L), EstMax = c(166.4685,
166.4685, 166.4685, 110.8175, 143.991, 143.991, 143.991, 196.292,
180.8075, 203.4815, 203.4815, 130.318, 130.318), RepxWeight = structure(c(6L,
6L, 6L, 5L, 1L, 1L, 1L, 7L, 4L, 2L, 2L, 3L, 3L), .Label = c("2x135",
"3x185", "4x115", "5x155", "5x95", "7x135", "8x155"), class = "factor"),
Note = structure(c(4L, 8L, 11L, 7L, 9L, 2L, 10L, 1L, 3L,
6L, 5L, 4L, 7L), .Label = c("…", "a little hard", "bad day",
"easy", "felt easy", "good day", "hard", "kinda easy", "not hard",
"super tired", "tired"), class = "factor")), .Names = c("Date",
"Exercise", "Category", "Weight", "Reps", "EstMax", "RepxWeight",
"Note"), class = "data.frame", row.names = c(NA, -13L))

# base R
aggregate(EstMax ~ Date + Exercise, data = data, FUN = max)
# Date Exercise EstMax
# 1 4/2/16 Bench Press 143.9910
# 2 4/9/16 Bench Press 143.9910
# 3 5/8/16 Bench Press 130.3180
# 4 4/18/16 Deadlift 196.2920
# 5 4/2/16 Deadlift 166.4685
# 6 5/8/16 Deadlift 203.4815

aggregate(EstMax ~ Date + Exercise + RepxWeight + Note, data = data, FUN = max)
# Date Exercise RepxWeight Note EstMax
# 1 4/18/16 Deadlift 8x155 … 196.2920
# 2 4/9/16 Bench Press 2x135 a little hard 143.9910
# 3 4/18/16 Deadlift 5x155 bad day 180.8075
# 4 5/8/16 Bench Press 4x115 easy 130.3180
# 5 4/2/16 Deadlift 7x135 easy 166.4685
# 6 5/8/16 Deadlift 3x185 felt easy 203.4815
# 7 5/8/16 Deadlift 3x185 good day 203.4815
# 8 5/8/16 Bench Press 4x115 hard 130.3180
# 9 4/2/16 Bench Press 5x95 hard 110.8175
# 10 4/2/16 Deadlift 7x135 kinda easy 166.4685
# 11 4/2/16 Bench Press 2x135 not hard 143.9910
# 12 4/9/16 Bench Press 2x135 super tired 143.9910
# 13 4/2/16 Deadlift 7x135 tired 166.4685


# data table
library("data.table")
data_dt <- data.table(data)
data_dt[ , max(EstMax), by = c("Date", "Exercise")]
# Date Exercise V1
# 1: 4/2/16 Deadlift 166.4685
# 2: 4/2/16 Bench Press 143.9910
# 3: 4/9/16 Bench Press 143.9910
# 4: 4/18/16 Deadlift 196.2920
# 5: 5/8/16 Deadlift 203.4815
# 6: 5/8/16 Bench Press 130.3180

data_dt[, max(EstMax), .(Date, Exercise, Weight, Reps, RepxWeight, Note)]
# Date Exercise Weight Reps RepxWeight Note V1
# 1: 4/2/16 Deadlift 135 7 7x135 easy 166.4685
# 2: 4/2/16 Deadlift 135 7 7x135 kinda easy 166.4685
# 3: 4/2/16 Deadlift 135 7 7x135 tired 166.4685
# 4: 4/2/16 Bench Press 95 5 5x95 hard 110.8175
# 5: 4/2/16 Bench Press 135 2 2x135 not hard 143.9910
# 6: 4/9/16 Bench Press 135 2 2x135 a little hard 143.9910
# 7: 4/9/16 Bench Press 135 2 2x135 super tired 143.9910
# 8: 4/18/16 Deadlift 155 8 8x155 … 196.2920
# 9: 4/18/16 Deadlift 155 5 5x155 bad day 180.8075
# 10: 5/8/16 Deadlift 185 3 3x185 good day 203.4815
# 11: 5/8/16 Deadlift 185 3 3x185 felt easy 203.4815
# 12: 5/8/16 Bench Press 115 4 4x115 easy 130.3180
# 13: 5/8/16 Bench Press 115 4 4x115 hard 130.3180

特别喜欢基础 R 解决方案。还看到了 which.max()功能可能会有所帮助,但无法弄清楚如何将其应用于此。

我看过但没有解决的其他相关问题:

Adding a non-aggregated column to an aggregated data set based on the aggregation of another column

Only keep min value for each factor level

How to select the row with the maximum value in each group

aggregating multiple columns in data.table

How to aggregate some columns while keeping other columns in R?

最佳答案

我知道您正在寻求基本的 R 解决方案,但与此同时,这里有一个 dplyr一:

library(dplyr)

data %>%
group_by(Date, Exercise) %>%
slice(which.max(EstMax))

# # A tibble: 6 x 8
# # Groups: Date, Exercise [6]
# Date Exercise Category Weight Reps EstMax RepxWeight Note
# <fctr> <fctr> <fctr> <int> <int> <dbl> <fctr> <fctr>
# 1 4/18/16 Deadlift Legs 155 8 196.2920 8x155 …
# 2 4/2/16 Bench Press Chest 135 2 143.9910 2x135 not hard
# 3 4/2/16 Deadlift Legs 135 7 166.4685 7x135 easy
# 4 4/9/16 Bench Press Chest 135 2 143.9910 2x135 a little hard
# 5 5/8/16 Bench Press Chest 115 4 130.3180 4x115 easy
# 6 5/8/16 Deadlift Legs 185 3 203.4815 3x185 good day

编辑
data.table不是我的强项,但为了完整起见,这是我的尝试:
library(data.table)

setDT(data)[, .SD[which.max(EstMax)], by = .(Date, Exercise)]

# Date Exercise Category Weight Reps EstMax RepxWeight Note
# 1: 4/2/16 Deadlift Legs 135 7 166.4685 7x135 easy
# 2: 4/2/16 Bench Press Chest 135 2 143.9910 2x135 not hard
# 3: 4/9/16 Bench Press Chest 135 2 143.9910 2x135 a little hard
# 4: 4/18/16 Deadlift Legs 155 8 196.2920 8x155 …
# 5: 5/8/16 Deadlift Legs 185 3 203.4815 3x185 good day
# 6: 5/8/16 Bench Press Chest 115 4 130.3180 4x115 easy

关于R:如何在保留其他列的同时聚合一些列,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/47418127/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com