python - Create new variables using two columns as the index and one column as the new variable names (python pandas or R)

Reposted. Author: 行者123. Updated: 2023-11-30 23:17:11

After reading the question, please help me edit the title if there is a better way to phrase it.

My data looks like this:

Location  Date  Item  Price
12        1     A     1
12        2     A     2
12        3     A     4
13        1     A     1
13        2     A     4
12        1     B     1
12        2     B     8
13        1     B     1
13        2     B     2
13        3     B     11

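I want to use Location and Date as the key and create one new variable per item that holds that item's price. For example, my desired output is: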

Location  Date  PriceA  PriceB
12        1     1       1
12        2     2       8
12        3     4       NaN
13        1     1       1
13        2     4       2
13        3     NaN     11

Best answer

You can try reshape from base R:

reshape(df, idvar=c('Location', 'Date'), timevar='Item', direction='wide')
#   Location Date Price.A Price.B
#1        12    1       1       1
#2        12    2       2       8
#3        12    3       4      NA
#4        13    1       1       1
#5        13    2       4       2
#10       13    3      NA      11
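
The base R result above uses `Price.A`/`Price.B` and keeps the original row name `10`, whereas the question asks for `PriceA`/`PriceB`. A minimal sketch of one possible cleanup follows; it is not part of the original answer, and the variable name `wide` is only illustrative. The data frame is rebuilt here with the same values as in the Data section so the snippet runs on its own.

```r
# Same values as the df in the "Data" section, rebuilt so this runs standalone
df <- data.frame(
  Location = c(12L, 12L, 12L, 13L, 13L, 12L, 12L, 13L, 13L, 13L),
  Date     = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L),
  Item     = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B"),
  Price    = c(1L, 2L, 4L, 1L, 4L, 1L, 8L, 1L, 2L, 11L),
  stringsAsFactors = FALSE
)

# Reshape to wide format: one Price.<Item> column per distinct Item
wide <- reshape(df, idvar = c("Location", "Date"),
                timevar = "Item", direction = "wide")

# Drop the "." separator so the columns are named PriceA and PriceB
names(wide) <- sub("^Price\\.", "Price", names(wide))

# Reset row names to 1..n (reshape keeps the original row names)
rownames(wide) <- NULL

wide
```

Combinations of Location and Date that have no price for an item come out as `NA`, which is R's counterpart of the `NaN` shown in the desired output.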

Or:

library(reshape2)
dcast(df, Location+Date~paste0('Price',Item), value.var='Price')
#  Location Date PriceA PriceB
#1       12    1      1      1
#2       12    2      2      8
#3       12    3      4     NA
#4       13    1      1      1
#5       13    2      4      2
#6       13    3     NA     11

Or you can convert the data to a data.table and use dcast.data.table (which will be faster):

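library(data.table)
# Note: setDT() and := modify df by reference, so recreate df from the
# "Data" section before running the other snippets
dcast.data.table(setDT(df)[, Item := paste0('Price', Item)],
                 ... ~ Item, value.var='Price')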

Or:

library(tidyr)
library(dplyr)
spread(df, Item, Price) %>%
  rename(PriceA=A, PriceB=B)
#  Location Date PriceA PriceB
#1       12    1      1      1
#2       12    2      2      8
#3       12    3      4     NA
#4       13    1      1      1
#5       13    2      4      2
#6       13    3     NA     11
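
In tidyr 1.0.0 and later, `spread()` is superseded by `pivot_wider()`, which can add the `Price` prefix directly through its `names_prefix` argument. The following is a sketch of that alternative rather than part of the original answer; it assumes a recent tidyr is installed, rebuilds the data frame with the same values as in the Data section, and uses `wide` as an illustrative variable name.

```r
library(tidyr)

# Same values as the df in the "Data" section, rebuilt so this runs standalone
df <- data.frame(
  Location = c(12L, 12L, 12L, 13L, 13L, 12L, 12L, 13L, 13L, 13L),
  Date     = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L),
  Item     = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B"),
  Price    = c(1L, 2L, 4L, 1L, 4L, 1L, 8L, 1L, 2L, 11L),
  stringsAsFactors = FALSE
)

# Location and Date are kept as id columns because they are neither
# names_from nor values_from; each Item becomes a column named Price<Item>
wide <- pivot_wider(df,
                    names_from   = Item,
                    values_from  = Price,
                    names_prefix = "Price")

wide
```

The result is a tibble, so no separate `rename()` step is needed; missing Location/Date/Item combinations are filled with `NA`.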

Update

If you don't need Price as a prefix, simply do:

dcast.data.table(setDT(df), ...~Item, value.var='Price')

And the reshape2 option would be:

dcast(df,...~Item, value.var='Price')

Data

df <- structure(list(Location = c(12L, 12L, 12L, 13L, 13L, 12L, 12L, 
13L, 13L, 13L), Date = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L,
3L), Item = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B"
), Price = c(1L, 2L, 4L, 1L, 4L, 1L, 8L, 1L, 2L, 11L)), .Names = c("Location",
"Date", "Item", "Price"), class = "data.frame", row.names = c(NA,
-10L))

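Regarding "python - Create new variables using two columns as the index and one column as the new variable names (python pandas or R)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/27415491/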
