
Reshaping a data frame in R with an id column, a time column, and one column holding multiple data variables

Reposted. Author: 行者123. Updated: 2023-12-04 09:25:47

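I haven't found anything that explains how to reshape a data frame that has a time column, an id column, and a single column holding several variables, which I want spread across separate columns.

If only two categories are involved, it's trivial: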

How to reshape data from long to wide format?

Reshaping data frame in R

However, I have:

geo    time    indic_na    value
AT 2014Q1 B11 2556
BE 2014Q1 B11 1506.0
... ... ... ...
AT 2014Q1 B1G 72065.1

And I want:
geo    time    B11       B1G       ...
AT 2014Q1 2556 72065.1 ...
AT 2013Q4 2535.4 ...
... ... ... ... ...
BE 2014Q1 1506.0 86513.0 ...

So I want every unique string in indic_na to become its own column variable. To get the data:
install.packages("SmarterPoland")
library(zoo)
library(SmarterPoland)
GDP <- getEurostatRCV(kod = "namq_gdp_c")
GDP$time = as.yearqtr(GDP$time)
GDP <- subset(GDP, (s_adj == "SWDA") & (unit == "MIO_EUR") & (time > "1989Q4"))

Then I tried:
testvector <- as.vector(unique(GDP$indic_na))
test <- reshape(data = GDP, direction = "long", idvar = "geo", timevar = "time", varying = testvector)

among maaany other things for "varying" ;-) and I get this error message:

Error in guess(varying) :
  failed to guess time-varying variables from their names

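I feel so close! But somehow I can't tell R that the variables are in the third column of my data frame. All the examples I found online either already have the variables in separate columns, or have only an id OR a time column plus one column of variables.

Any help would be great!

Easily reproducible data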
> dput(head(GDP))
structure(list(geo = structure(c(1L, 3L, 4L, 5L, 6L, 7L), .Names = c("SWDA,MIO_EUR,B11,AT",
"SWDA,MIO_EUR,B11,BE", "SWDA,MIO_EUR,B11,BG", "SWDA,MIO_EUR,B11,CH",
"SWDA,MIO_EUR,B11,CY", "SWDA,MIO_EUR,B11,CZ"), .Label = c("AT",
"BA", "BE", "BG", "CH", "CY", "CZ", "DE", "DK", "EA", "EA12",
"EA17", "EA18", "EE", "EL", "ES", "EU15", "EU27", "EU28", "FI",
"FR", "HR", "HU", "IE", "IS", "IT", "JP", "LT", "LU", "LV", "ME",
"MK", "MT", "NL", "NO", "PL", "PT", "RO", "RS", "SE", "SI", "SK",
"TR", "UK", "US"), class = "factor"), time = structure(c(2014,
2014, 2014, 2014, 2014, 2014), class = "yearqtr"), indic_na = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Names = c("SWDA,MIO_EUR,B11,AT", "SWDA,MIO_EUR,B11,BE",
"SWDA,MIO_EUR,B11,BG", "SWDA,MIO_EUR,B11,CH", "SWDA,MIO_EUR,B11,CY",
"SWDA,MIO_EUR,B11,CZ"), .Label = c("B11", "B111", "B112", "B1G",
"B1GM", "B1GM_XE", "B1GM_XI", "B1GM_XO", "B2G_B3G", "D1", "D2_M_D3",
"D21_M_D31", "P3", "P3_P5", "P3_S13", "P31_S13", "P31_S14", "P31_S14_S15",
"P31_S15", "P32_S13", "P5", "P51", "P52", "P52_P53", "P53", "P6",
"P7"), class = "factor"), value = c(2556.8, 1506, NA, NA, NA,
3056.1)), .Names = c("geo", "time", "indic_na", "value"), row.names = 7753:7758, class = "data.frame")

Best answer

Thank you for such a clear question! That's rare from new users. I recommend reshape2 over reshape.

# Keep only the four columns from your example so the data matches it
GDP <- subset(GDP, (s_adj == "SWDA") & (unit == "MIO_EUR") & (time > "1989Q4"),
              select = c("geo", "time", "indic_na", "value"))

library(reshape2)
# One row per geo/time pair; each distinct indic_na value becomes its own column
GDP_wide <- dcast(GDP, geo + time ~ indic_na, value.var = "value")

> head(GDP_wide)
geo time B11 B111 B112 ...
1 AT 1990 Q1 -64.3 -1407.1 1337.6
2 AT 1990 Q2 -37.2 -1432.0 1450.3
3 AT 1990 Q3 -39.4 -1457.4 1544.2
4 AT 1990 Q4 -78.7 -1546.7 1592.7
5 AT 1991 Q1 -140.2 -1771.9 1583.0
6 AT 1991 Q2 -183.7 -1938.5 1568.3
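
For comparison, base R's reshape() can produce the same layout without an extra package. The attempt in the question most likely failed because the data are already in long format: direction = "long" expects the names in varying to be existing columns, whereas B11, B1G and so on are values of indic_na. Going the other way needs direction = "wide", with indic_na as the timevar and both geo and time as idvar. The sketch below is a minimal illustration on a hypothetical four-row data frame (the name `long` and the rows are assumptions, with values echoing the question's sample rows), not the answerer's method:

```r
# Hypothetical toy data in the same long layout as the question's GDP frame.
long <- data.frame(
  geo      = c("AT", "AT", "BE", "BE"),
  time     = c("2014Q1", "2014Q1", "2014Q1", "2014Q1"),
  indic_na = c("B11", "B1G", "B11", "B1G"),
  value    = c(2556.8, 72065.1, 1506.0, 86513.0),
  stringsAsFactors = FALSE
)

# Long -> wide: each (geo, time) pair identifies a row, and every distinct
# indic_na value becomes one column holding the matching `value`.
wide <- reshape(
  long,
  direction = "wide",
  idvar     = c("geo", "time"),
  timevar   = "indic_na",
  v.names   = "value"
)

# reshape() names the new columns "value.B11", "value.B1G", ...;
# strip the prefix so the headers match the dcast() result.
names(wide) <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

The dcast() formula is more compact and is usually easier to read for this kind of reshape, which is one reason to prefer it when reshape2 is available.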

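This post is based on a similar question on Stack Overflow about reshaping a data frame in R with an id column, a time column, and one column holding multiple data variables: https://stackoverflow.com/questions/24335891/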
