
R - Create a dataset showing the delta/progress between two similar datasets

Reposted. Author: 行者123. Updated: 2023-12-04 03:43:12

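I have two datasets, snapshots of the same records taken at two different points in time. Each record has a unique key (Player.ID). Some identifying information needs to be carried forward (Alliance.ID, Alliance.Tag, Player.ID, Player), and all remaining attributes need to be evaluated for change over time (each new data point minus the corresponding old data point) to produce a new, third dataset. Finally, we need to account for new players (all stats set to NEW) and removed players (all stats set to MISSING).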

Each dataset has roughly 60,000 records.

[Image: three data sets: old, new, calculated]

I have no code to share. Any suggestions on approach, packages, or code would be appreciated.

Best answer

Something like this?

library(dplyr)

# Tag each snapshot so we can tell old rows (df1) from new rows (df2)
df1$Source <- "df1"
df2$Source <- "df2"

output <- bind_rows(df1, df2) %>%
  group_by(Player.ID) %>%
  mutate(
    DATE = as.Date(DATE, format = "%m/%d/%Y"),
    # 1 row: the player is in only one snapshot; 2 rows: in both
    ROW.COUNT = n(),
    POINTS = case_when(
      ROW.COUNT == 1 & Source == "df1" ~ "MISSING",
      ROW.COUNT == 1 & Source == "df2" ~ "NEW",
      # new value minus old value, with rows ordered by snapshot date
      ROW.COUNT > 1 ~ as.character(as.numeric(POINTS) - lag(as.numeric(POINTS), order_by = DATE)),
      TRUE ~ "ERROR"
    ),
    POWER.POINTS = case_when(
      ROW.COUNT == 1 & Source == "df1" ~ "MISSING",
      ROW.COUNT == 1 & Source == "df2" ~ "NEW",
      ROW.COUNT > 1 ~ as.character(as.numeric(POWER.POINTS) - lag(as.numeric(POWER.POINTS), order_by = DATE)),
      TRUE ~ "ERROR"
    ),
    PLAYERS.KILLED = case_when(
      ROW.COUNT == 1 & Source == "df1" ~ "MISSING",
      ROW.COUNT == 1 & Source == "df2" ~ "NEW",
      ROW.COUNT > 1 ~ as.character(as.numeric(PLAYERS.KILLED) - lag(as.numeric(PLAYERS.KILLED), order_by = DATE)),
      TRUE ~ "ERROR"
    )
  ) %>%
  # Keep one row per player: the most recent one, which holds the delta
  filter(DATE == max(DATE)) %>%
  ungroup() %>%
  select(-DATE, -Source, -ROW.COUNT)
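The pipeline above repeats the same case_when() for each stat, which gets unwieldy if "all remaining attributes" means many columns. One possible way to shorten it is dplyr's across(), sketched below. This is only a sketch and assumes dplyr 1.0.0 or later; the names old_snap, new_snap, stat_cols and delta are made up for this illustration, and the two small data frames are a cut-down copy of the sample data so the block runs on its own.

```r
library(dplyr)

# Illustrative snapshots (a subset of the sample data's columns)
old_snap <- data.frame(
  DATE = c("12/18/2020", "12/28/2020"),
  Player.ID = c("z90c0b60dd58", "grfk349i342k3"),
  PLAYER = c("Deanna", "Gregor"),
  POINTS = c("16746162", "19269094"),
  POWER.POINTS = c("77083200", "87691376"),
  PLAYERS.KILLED = c("7337", "4698"),
  stringsAsFactors = FALSE
)
new_snap <- data.frame(
  DATE = c("1/4/2021", "1/4/2021"),
  Player.ID = c("z90c0b60dd58", "2387hyf23u8i4"),
  PLAYER = c("Deanna", "NewGuy"),
  POINTS = c("16786133", "1254"),
  POWER.POINTS = c("77089878", "0"),
  PLAYERS.KILLED = c("7368", "3"),
  stringsAsFactors = FALSE
)

# Assumption: these are the columns to diff; list more here as needed
stat_cols <- c("POINTS", "POWER.POINTS", "PLAYERS.KILLED")

delta <- bind_rows(old = old_snap, new = new_snap, .id = "Source") %>%
  mutate(DATE = as.Date(DATE, format = "%m/%d/%Y")) %>%
  group_by(Player.ID) %>%
  mutate(ROW.COUNT = n()) %>%
  mutate(
    # Apply the same rule to every stat column
    across(
      all_of(stat_cols),
      ~ case_when(
        ROW.COUNT == 1 & Source == "old" ~ "MISSING",
        ROW.COUNT == 1 & Source == "new" ~ "NEW",
        TRUE ~ as.character(as.numeric(.x) - lag(as.numeric(.x), order_by = DATE))
      )
    )
  ) %>%
  # Keep the most recent row per player, which carries the delta
  filter(DATE == max(DATE)) %>%
  ungroup() %>%
  select(-DATE, -Source, -ROW.COUNT)

print(delta)
```

Adding another stat then only means adding its name to stat_cols. Like the original, this keeps the result as character so that numeric deltas can sit alongside the NEW and MISSING labels.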

Data

df1 <- structure(
  list(
    DATE = c("12/18/2020", "12/28/2020"),
    Alliance.ID = c("9745908", "8798794"),
    Alliance.Tag = c("StkOvflw", "ILoveR"),
    Player.ID = c("z90c0b60dd58", "grfk349i342k3"),
    PLAYER = c("Deanna", "Gregor"),
    LVL = c(89, 22),
    RANK = c("Admiral", "Newb"),
    POINTS = c("16746162", "19269094"),
    POWER.POINTS = c("77083200", "87691376"),
    PLAYERS.KILLED = c("7337", "4698")
  ),
  .Names = c(
    "DATE",
    "Alliance.ID",
    "Alliance.Tag",
    "Player.ID",
    "PLAYER",
    "LVL",
    "RANK",
    "POINTS",
    "POWER.POINTS",
    "PLAYERS.KILLED"
  ),
  row.names = c(NA, -2L),
  class = "data.frame"
)

df2 <- structure(
  list(
    DATE = c("1/4/2021", "1/4/2021"),
    Alliance.ID = c("9745908", "5874162"),
    Alliance.Tag = c("StkOvflw", "NewGuy"),
    Player.ID = c("z90c0b60dd58", "2387hyf23u8i4"),
    PLAYER = c("Deanna", "NewGuy"),
    LVL = c(90, 1),
    RANK = c("Admiral", "Newb"),
    POINTS = c("16786133", "1254"),
    POWER.POINTS = c("77089878", "0"),
    PLAYERS.KILLED = c("7368", "3")
  ),
  .Names = c(
    "DATE",
    "Alliance.ID",
    "Alliance.Tag",
    "Player.ID",
    "PLAYER",
    "LVL",
    "RANK",
    "POINTS",
    "POWER.POINTS",
    "PLAYERS.KILLED"
  ),
  row.names = c(NA, -2L),
  class = "data.frame"
)
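The ROW.COUNT logic in the answer relies on each Player.ID appearing at most once per snapshot, which the question states is the case. On real snapshots of around 60,000 rows it may be worth confirming that before computing deltas. The sketch below is one way to do so in base R; duplicated_keys is a hypothetical helper written for this illustration, and the small data frame deliberately contains a repeated key.

```r
# Hypothetical helper: return key values that occur more than once in a snapshot
duplicated_keys <- function(snapshot, key = "Player.ID") {
  ids <- snapshot[[key]]
  unique(ids[duplicated(ids)])
}

# Illustrative snapshot with one Player.ID repeated on purpose
snapshot <- data.frame(
  Player.ID = c("z90c0b60dd58", "grfk349i342k3", "z90c0b60dd58"),
  POINTS = c("16746162", "19269094", "16746162"),
  stringsAsFactors = FALSE
)

dups <- duplicated_keys(snapshot)
if (length(dups) > 0) {
  message("Duplicated Player.ID values: ", paste(dups, collapse = ", "))
}
```

If the check reports anything, a player with two rows inside a single snapshot would be treated as present in both snapshots, so the duplicates should be resolved first.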

Regarding "R - Create a dataset showing the delta/progress between two similar datasets", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65582536/
