
R: running count based on a different column

Reposted. Author: 行者123. Updated: 2023-12-02 03:43:03

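I want a running count of how many times a value in ColumnA has previously appeared in ColumnB. Ideally, the count could also be computed within subsets defined by ColumnC.

For example, here I want a running total of the losses each winner had before the match, or of the wins each loser had before the match: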

# Packages used by the attempts below
library(data.table)
library(dplyr)

# Create the data frame
year <- c(2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016, 2016)
winner <- c('sam', 'ryan', 'sally', 'sally', 'ryan', 'sally', 'mike', 'ryan', 'mike', 'sam')
loser <- c('mike', 'mike', 'ryan', 'sam', 'sam', 'mike', 'sally', 'mike', 'ryan', 'sally')
df <- data.frame(year, winner, loser)

# Successful methods for getting the winner's cumulative wins or the loser's cumulative losses
df <- as.data.table(df)[, winner_wins := seq(.N), by = "winner"][]
df <- as.data.table(df)[, loser_losses := seq(.N), by = "loser"][]

# Successful methods for getting the winner's cumulative wins or the loser's cumulative losses by year
df <- df %>% group_by(year, winner) %>% mutate(winner_wins = row_number())
df <- df %>% group_by(year, loser) %>% mutate(loser_losses = row_number())

# Failed attempt to get the winner's cumulative losses by year
df <- df %>% group_by(year) %>% mutate(winner_losses = cumsum(winner == loser & year == year))
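A short sketch of why the last attempt fails, using only the five 2017 rows from the data above: `winner == loser` compares the two columns row by row, so it asks whether a player beat themselves and never looks at earlier rows. The `prior_losses` helper vector is a hypothetical illustration of what the comparison would have to do instead, not an efficient solution.

```r
# The five 2017 matches from the question, in their original order
winner <- c("sam", "ryan", "sally", "sally", "ryan")
loser  <- c("mike", "mike", "ryan", "sam", "sam")

# Element-wise comparison: row i of winner against row i of loser only
same_row <- winner == loser
cumsum(same_row)  # stays at zero because nobody plays against themselves

# What is actually wanted: for row i, count earlier rows where
# the current winner appears in the loser column
prior_losses <- sapply(seq_along(winner), function(i) {
  sum(loser[seq_len(i - 1)] == winner[i])
})
prior_losses
```

Only the last row is non-zero, because ryan lost the third match before winning the fifth.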

I would like the output to be my original data frame with four new columns: winner_cum_wins, winner_cum_losses, loser_cum_wins, loser_cum_losses.

Best answer

This should give you all the data you need, in long format:

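library(tidyverse)

# Assumes df holds only the original columns: year, winner, loser
df %>%
  group_by(year) %>%
  mutate(match_id_year = row_number()) %>%          # number the matches within each year
  gather(outcome, name, -year, -match_id_year) %>%  # one row per match and player
  arrange(year, match_id_year) %>%
  group_by(year, name) %>%
  mutate(cum_wins_year = cumsum(outcome == "winner"),    # running wins per player and year
         cum_losses_year = cumsum(outcome == "loser"))   # running losses per player and year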
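The answer above returns a long table with one row per match and player. As a minimal sketch of how that could be turned into the four columns the question asks for, the long counts can be joined back onto the original rows twice, once for the winner and once for the loser. This assumes rows are listed in chronological order within each year, that a player never faces themselves, and that `match_id`, `long`, and `result` are names introduced here for illustration. It uses `pivot_longer()` from tidyr in place of `gather()`.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  year   = c(2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016, 2016),
  winner = c("sam", "ryan", "sally", "sally", "ryan", "sally", "mike", "ryan", "mike", "sam"),
  loser  = c("mike", "mike", "ryan", "sam", "sam", "mike", "sally", "mike", "ryan", "sally"),
  stringsAsFactors = FALSE
)

# Long format: one row per (match, player), with running totals per year and player
long <- df %>%
  mutate(match_id = row_number()) %>%   # assumed to reflect chronological order
  pivot_longer(c(winner, loser), names_to = "outcome", values_to = "name") %>%
  arrange(match_id) %>%
  group_by(year, name) %>%
  mutate(cum_wins   = cumsum(outcome == "winner"),
         cum_losses = cumsum(outcome == "loser")) %>%
  ungroup() %>%
  select(match_id, name, cum_wins, cum_losses)

# Join the running totals back: first for the winner, then for the loser
result <- df %>%
  mutate(match_id = row_number()) %>%
  left_join(long, by = c("match_id", "winner" = "name")) %>%
  rename(winner_cum_wins = cum_wins, winner_cum_losses = cum_losses) %>%
  left_join(long, by = c("match_id", "loser" = "name")) %>%
  rename(loser_cum_wins = cum_wins, loser_cum_losses = cum_losses) %>%
  select(-match_id)

result
```

The win and loss totals include the current match, so winner_cum_wins counts the win in that row, while winner_cum_losses counts only losses from earlier matches in the same year.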

A similar question about running counts in R based on a different column can be found on Stack Overflow: https://stackoverflow.com/questions/49227218/
