
python - How can I restructure data in R that consists of 8 repeating rows and 24 columns?

Reposted. Author: 行者123. Updated: 2023-11-30 04:07:52

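I'm new to R, so please bear with me (I'll try to be as descriptive as possible to respect your time).

For a while now I have been trying to get some data into the right format. It describes 8 measurements taken every hour for a little over a year. Because of the way I had to retrieve the data, the spreadsheet I have now lists it in a tabular layout in which the 8 variable names appear as repeating rows and each hour of the day is a separate column, like this: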

var1[0] var1[1] var1[2] var1[3] var1[4] var1[5] var1[6] var1[7] var1[8] var1[9] var1[10] var1[11] var1[12] var1[13] var1[14] var1[15] var1[16] var1[17] var1[18] var1[19] var1[20] var1[21] var1[22] var1[23]
var2[0] var2[1] var2[2] var2[3] var2[4] var2[5] var2[6] var2[7] var2[8] var2[9] var2[10] var2[11] var2[12] var2[13] var2[14] var2[15] var2[16] var2[17] var2[18] var2[19] var2[20] var2[21] var2[22] var2[23]
var3[0] var3[1] var3[2] var3[3] var3[4] var3[5] var3[6] var3[7] var3[8] var3[9] var3[10] var3[11] var3[12] var3[13] var3[14] var3[15] var3[16] var3[17] var3[18] var3[19] var3[20] var3[21] var3[22] var3[23]
var4[0] var4[1] var4[2] var4[3] var4[4] var4[5] var4[6] var4[7] var4[8] var4[9] var4[10] var4[11] var4[12] var4[13] var4[14] var4[15] var4[16] var4[17] var4[18] var4[19] var4[20] var4[21] var4[22] var4[23]
var5[0] var5[1] var5[2] var5[3] var5[4] var5[5] var5[6] var5[7] var5[8] var5[9] var5[10] var5[11] var5[12] var5[13] var5[14] var5[15] var5[16] var5[17] var5[18] var5[19] var5[20] var5[21] var5[22] var5[23]
var6[0] var6[1] var6[2] var6[3] var6[4] var6[5] var6[6] var6[7] var6[8] var6[9] var6[10] var6[11] var6[12] var6[13] var6[14] var6[15] var6[16] var6[17] var6[18] var6[19] var6[20] var6[21] var6[22] var6[23]
var7[0] var7[1] var7[2] var7[3] var7[4] var7[5] var7[6] var7[7] var7[8] var7[9] var7[10] var7[11] var7[12] var7[13] var7[14] var7[15] var7[16] var7[17] var7[18] var7[19] var7[20] var7[21] var7[22] var7[23]
var8[0] var8[1] var8[2] var8[3] var8[4] var8[5] var8[6] var8[7] var8[8] var8[9] var8[10] var8[11] var8[12] var8[13] var8[14] var8[15] var8[16] var8[17] var8[18] var8[19] var8[20] var8[21] var8[22] var8[23]

var1[24] var1[25] var1[26] var1[27] var1[28] var1[29] var1[30] var1[31] var1[32] var1[33] var1[34] var1[35] var1[36] var1[37] var1[38] var1[39] var1[40] var1[41] var1[42] var1[43] var1[44] var1[45] var1[46] var1[47]
var2[24] var2[25] var2[26] var2[27] var2[28] var2[29] var2[30] var2[31] var2[32] var2[33] var2[34] var2[35] var2[36] var2[37] var2[38] var2[39] var2[40] var2[41] var2[42] var2[43] var2[44] var2[45] var2[46] var2[47]
var3[24] var3[25] var3[26] var3[27] var3[28] var3[29] var3[30] var3[31] var3[32] var3[33] var3[34] var3[35] var3[36] var3[37] var3[38] var3[39] var3[40] var3[41] var3[42] var3[43] var3[44] var3[45] var3[46] var3[47]
var4[24] var4[25] var4[26] var4[27] var4[28] var4[29] var4[30] var4[31] var4[32] var4[33] var4[34] var4[35] var4[36] var4[37] var4[38] var4[39] var4[40] var4[41] var4[42] var4[43] var4[44] var4[45] var4[46] var4[47]
var5[24] var5[25] var5[26] var5[27] var5[28] var5[29] var5[30] var5[31] var5[32] var5[33] var5[34] var5[35] var5[36] var5[37] var5[38] var5[39] var5[40] var5[41] var5[42] var5[43] var5[44] var5[45] var5[46] var5[47]
var6[24] var6[25] var6[26] var6[27] var6[28] var6[29] var6[30] var6[31] var6[32] var6[33] var6[34] var6[35] var6[36] var6[37] var6[38] var6[39] var6[40] var6[41] var6[42] var6[43] var6[44] var6[45] var6[46] var6[47]
var7[24] var7[25] var7[26] var7[27] var7[28] var7[29] var7[30] var7[31] var7[32] var7[33] var7[34] var7[35] var7[36] var7[37] var7[38] var7[39] var7[40] var7[41] var7[42] var7[43] var7[44] var7[45] var7[46] var7[47]
var8[24] var8[25] var8[26] var8[27] var8[28] var8[29] var8[30] var8[31] var8[32] var8[33] var8[34] var8[35] var8[36] var8[37] var8[38] var8[39] var8[40] var8[41] var8[42] var8[43] var8[44] var8[45] var8[46] var8[47]

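There was a lot more data originally, but I have pared it down to this to get at the problem I keep running into. (In the example above, I'm trying to convey that each variable (var1, var2, var3, etc.) is recorded at every hour (t1, t2, t3, etc.).)

My goal is to reformat it so that it looks like this: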

var1[0] var2[0] var3[0] var4[0] var5[0] var6[0] var7[0] var8[0]
var1[1] var2[1] var3[1] var4[1] var5[1] var6[1] var7[1] var8[1]
var1[2] var2[2] var3[2] var4[2] var5[2] var6[2] var7[2] var8[2]
var1[3] var2[3] var3[3] var4[3] var5[3] var6[3] var7[3] var8[3]
. . . . . . . .
. . . . . . . .
. . . . . . . .
[all the way to 9216, which is the number of hours in 384 days]

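So far I have spent a long time trying to do this in Excel without finding a way. I also looked into writing a C++ program, as I have done before, but I feel there may be an easier way. My latest effort is a switch to R, since I've been trying to learn it and I've heard it is well suited to this kind of data manipulation. In R, I tried to follow an example I found (here) that had me recreate the data as a matrix of a different length, but that produced very wrong data. (I'm sure I probably misused the method.) I also looked at the solution discussed here, but I couldn't adapt the code to my situation. Maybe I'm overlooking something simple?

Does anyone have any suggestions? As I said, I'm currently trying to do this in R, but I'm open to suggestions in Excel, C, or Python. (I'm certainly open to suggestions in other languages too, but those might need a more thorough explanation :))

Thanks!

[Edit:]

The data sample above is only descriptive. Below is what the first 25 lines of the data actually look like; for confidentiality reasons, the only change I made was to replace the variable names: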

Metric,Year,Month,Day,DOW,12am,1am,2am,3am,4am,5am,6am,7am,8am,9am,10am,11am,12pm,1pm,2pm,3pm,4pm,5pm,6pm,7pm,8pm,9pm,10pm,11pm
varA,2013,1,20,Sun,0,0,0,0,0,0,0,0,0,9,22,10,18,24,26,11,21,24,10,0,0,0,0,0
varB,2013,1,20,Sun,0,0,0,0,0,0,0,0,0,10,13,18,28,26,25,25,21,23,13,0,0,0,0,0
varC,2013,1,20,Sun,0,0,0,0,0,0,0,0,0,0,1,7,9,5,1,4,4,1,7,1,0,0,0,0
varD,2013,1,20,Sun,0,0,0,0,0,0,0,0,0,9,23,17,27,29,27,15,25,25,17,1,0,0,0,0
varE,2013,1,20,Sun,0,0,0,0,0,0,0,0,0,44,32,33,65,37,42,62,75,71,50,0,0,0,0,0
varF,2013,1,20,Sun,0,0,0,0,0,0,0,0,0,89,82,83,94,37,77,100,100,90,60,0,0,0,0,0
varG,2013,1,20,Sun,0,0,0,0,0,0,0,0,0,100,100,100,100,95,100,100,100,100,100,0,0,0,0,0
varH,2013,1,20,Sun,0,0,0,0,0,0,0,0,0,9,10,92,12,101,34,14,64,29,86,0,0,0,0,0
varA,2013,1,21,Mon,0,0,0,0,0,0,0,0,0,5,12,23,20,22,24,9,19,15,12,13,9,0,0,0
varB,2013,1,21,Mon,0,0,0,0,0,0,0,0,0,6,14,21,27,26,23,19,22,16,16,16,12,0,0,0
varC,2013,1,21,Mon,0,0,0,0,0,0,0,0,0,2,5,4,10,6,10,2,7,7,4,5,5,0,0,0
varD,2013,1,21,Mon,0,0,0,0,0,0,0,0,0,7,18,27,30,28,34,12,26,22,16,18,14,0,0,0
varE,2013,1,21,Mon,0,0,0,0,0,0,0,0,0,0,50,20,15,67,33,71,47,36,64,58,67,0,0,0
varF,2013,1,21,Mon,0,0,0,0,0,0,0,0,0,60,70,45,70,90,67,100,100,79,91,92,89,0,0,0
varG,2013,1,21,Mon,0,0,0,0,0,0,0,0,0,100,100,100,100,100,94,100,100,100,91,100,100,0,0,0
varH,2013,1,21,Mon,0,0,0,0,0,0,0,0,0,20,12,31,20,29,16,12,12,16,16,34,41,0,0,0
varA,2013,1,22,Tue,0,0,0,0,0,0,0,0,0,9,14,18,25,16,20,22,11,23,13,9,4,0,0,0
varB,2013,1,22,Tue,0,0,0,0,0,0,0,0,0,20,23,17,28,14,18,30,17,27,17,17,6,0,0,0
varC,2013,1,22,Tue,0,0,0,0,0,0,0,0,0,4,8,2,3,2,6,7,2,4,1,2,1,0,0,0
varD,2013,1,22,Tue,0,0,0,0,0,0,0,0,0,13,22,20,29,18,26,29,13,27,14,11,5,0,0,0
varE,2013,1,22,Tue,0,0,0,0,0,0,0,0,0,83,90,43,30,29,17,32,60,71,54,89,100,0,0,0
varF,2013,1,22,Tue,0,0,0,0,0,0,0,0,0,100,100,86,65,43,56,74,90,90,73,100,100,0,0,0
varG,2013,1,22,Tue,0,0,0,0,0,0,0,0,0,100,100,100,100,100,100,100,100,100,100,100,100,0,0,0
varH,2013,1,22,Tue,0,0,0,0,0,0,0,0,0,14,23,17,30,16,14,12,8,9,13,14,6,0,0,0

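As you can see, the full dataset has five extra columns at the start, holding the variable name and the date information.

Best answer

Assuming your data is in a matrix M (only the 24 hourly columns, with 8 rows per day), this should work: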

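output <- NULL
# One block of 8 rows per day: 9216 hours / 24 = 384 blocks (not 9216/8).
last.count <- nrow(M)/8 - 1
for (i in 0:last.count) {
  # Transpose one day's 8 x 24 block to 24 x 8 and append it.
  output <- rbind(output, t(M[8*i + 1:8,]))
}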
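
The loop expects M to hold only the hourly values, while the file shown in the question's edit has five leading columns (Metric, Year, Month, Day, DOW). A minimal sketch of getting from that layout to M might look like the following. The toy data frame is synthetic (each value encodes its row number and hour so the result is easy to check), and the file name in the commented-out read.csv call is hypothetical.

```r
# Synthetic stand-in for the real file: 2 days x 8 metrics, 24 hourly columns.
# With the real data one would instead read the file, for example:
#   df <- read.csv("data.csv", check.names = FALSE)   # "data.csv" is a hypothetical name
metrics <- paste0("var", LETTERS[1:8])
hours   <- c("12am", paste0(1:11, "am"), "12pm", paste0(1:11, "pm"))

df <- data.frame(
  Metric = rep(metrics, times = 2),
  Year   = 2013,
  Month  = 1,
  Day    = rep(20:21, each = 8),
  DOW    = rep(c("Sun", "Mon"), each = 8),
  stringsAsFactors = FALSE
)

# Toy values: row r, hour h  ->  r * 100 + h
vals <- outer(seq_len(nrow(df)), 0:23, function(r, h) r * 100 + h)
colnames(vals) <- hours
df <- cbind(df, vals)

# Drop the five leading columns so that only the 24 hourly columns remain.
M <- as.matrix(df[, -(1:5)])

# Same block-transpose idea as in the answer: one 8-row block per day.
n.blocks <- nrow(M) / 8
output <- NULL
for (i in 0:(n.blocks - 1)) {
  output <- rbind(output, t(M[8 * i + 1:8, ]))
}

# Label the 8 result columns with the metric names of the first block.
colnames(output) <- df$Metric[1:8]

dim(output)   # 48 rows (2 days x 24 hours), 8 columns (one per metric)
```

This relies on every day having exactly 8 rows in the same metric order, which is true of the sample shown in the question but would need checking on the full file.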

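P.S. rbind can be slow (depending on the size of the data); in that case you can preallocate the output matrix.

Regarding "python - How can I restructure data in R that consists of 8 repeating rows and 24 columns?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/22162174/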
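
As a rough illustration of the preallocation idea, the sketch below allocates the full result once and fills it block by block. The function names reshape_prealloc and reshape_aperm are made up for this example, and the second function is an optional loop-free variant built on array() and aperm(); neither comes from the original answer. Both assume a numeric matrix M whose row count is a multiple of the block size.

```r
# Preallocated variant: create the result once, then fill it block by block.
reshape_prealloc <- function(M, block = 8) {
  stopifnot(nrow(M) %% block == 0)
  n.blocks <- nrow(M) / block
  n.hours  <- ncol(M)
  output <- matrix(NA_real_, nrow = n.blocks * n.hours, ncol = block)
  for (i in seq_len(n.blocks)) {
    rows.in  <- (i - 1) * block + seq_len(block)      # 8 input rows for day i
    rows.out <- (i - 1) * n.hours + seq_len(n.hours)  # 24 output rows for day i
    output[rows.out, ] <- t(M[rows.in, , drop = FALSE])
  }
  output
}

# Loop-free variant: view the data as hours x variables x days,
# then reorder the dimensions so that hours and days form the rows.
reshape_aperm <- function(M, block = 8) {
  stopifnot(nrow(M) %% block == 0)
  n.blocks <- nrow(M) / block
  n.hours  <- ncol(M)
  arr <- array(t(M), dim = c(n.hours, block, n.blocks))
  matrix(aperm(arr, c(1, 3, 2)), ncol = block)
}

# Toy input: 2 days x 8 variables, 24 hours; M[r, c] = (r - 1) * 24 + c
M <- matrix(as.numeric(seq_len(16 * 24)), nrow = 16, byrow = TRUE)

# Reference result from the rbind loop, for comparison.
slow <- NULL
for (i in 0:(nrow(M) / 8 - 1)) {
  slow <- rbind(slow, t(M[8 * i + 1:8, ]))
}

fast <- reshape_prealloc(M)
vec  <- reshape_aperm(M)
all(fast == slow)   # both variants should agree with the rbind loop
all(vec == slow)
```

For a few thousand rows the difference is unlikely to matter much; preallocation mainly pays off when the number of blocks grows large, because rbind copies the accumulated matrix on every iteration.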
