
r - How to reshape a dataset (long to wide) with two measurement columns per category, without additional calculations in R

Reposted. Author: 行者123. Updated: 2023-12-04 11:44:57

I have this dataset in long format (see the code below to generate it):

region   week   average   percent
A        20     5         30
A        21     7         40
A        22     15        50
B        20     4         15
B        21     8         27
B        22     3         11
...

I want to prepare a presentation (HTML with RMarkdown), so I need to convert it to wide format, with average and percent columns for each week, like this:

         20                21                22
region   average percent   average percent   average percent
A        5       30        7       40        15      50
B        4       15        8       27        3       11

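I have explored dcast, dplyr, tidyr, htmlTable and many others, without success. I don't need to perform any calculations, only to organize the dataset in a different layout.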

I have done this in the past, but I had to write a lot of code. I would like to know whether there is a simple way to do it in R.

You can create the test dataset with this code:

region = c( "A", "A", "A", "B", "B", "B" )
week = c( "20", "21", "22", "20", "21", "22" )
average = c( 5, 7, 15, 4, 8, 3 )
percent = c( 30, 40, 50, 15, 27, 11 )

test = data.frame(
  region,
  week,
  average,
  percent
)

Any help is appreciated.

Thanks.

Best answer

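Both answers, from @akrun and @Matt L., solve the problem well. @Matt L.'s is definitely the simplest. Thank you. I am posting here the solution I came up with based on @akrun's answer. I will definitely use tidyr in my final code.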

library(htmlTable)
library(data.table)
library(Hmisc)

# Create input (initial) dataset (long)
region = c( "A", "A", "A", "B", "B", "B" )
week = c( "20", "21", "22", "20", "21", "22" )
average = c( 5, 7, 15, 4, 8, 3 )
percent = c( 30, 40, 50, 15, 27, 11 )

input_ds = data.frame(
  region,
  week,
  average,
  percent
)

# Reshape the dataset into wide, using columns average and percent
# for each week (column names become average_20, ..., percent_22)
reshaped_ds = dcast(
  setDT( input_ds ),
  region ~ week,
  value.var = c("average", "percent")
)

# Extract the week number from each column name and get the column
# indices sorted by week number
col_order <- order(
  as.numeric(
    sub( ".*_", "", names( reshaped_ds )[-1] )
  )
)

# Re-order columns according to col_order (region stays first)
setcolorder(
  reshaped_ds, names(reshaped_ds)[c(1, col_order + 1)]
)

# Prepare the names for group columns ("Week 20", "Week 21", ...)
col_group_names = unique(
  paste(
    "Week",
    sub( ".*_", "", names(reshaped_ds)[-1] )
  )
)

# Copy the table so the renaming below cannot affect reshaped_ds
# (plain assignment of a data.table does not copy it)
final_table_ds = copy( reshaped_ds )

# Remove '_##' from column names
names( final_table_ds ) = sub(
  "_.*", "", names( final_table_ds )
)

# Capitalize the first letter of each column name
names( final_table_ds ) = capitalize( names( final_table_ds ) )

# Generate the final table in HTML
htmlTable(
  final_table_ds,
  rnames = FALSE,
  cgroup = c( "", col_group_names ),
  n.cgroup = c( 1, rep( 2, length(col_group_names) ) ),
  col.rgroup = c( "none", "#EDEDED" )
)

Final output (the screenshot of the rendered HTML table is not reproduced here).
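
The answer above says tidyr will be used in the final code but does not show it. As a rough sketch only, and not the code from @Matt L.'s answer, the same reshape could probably be done in a single call with `tidyr::pivot_wider`. This assumes tidyr 1.2.0 or later, since the `names_vary` argument is what keeps the `average` and `percent` columns of each week next to each other. The name `wide_ds` is made up for this illustration.

```r
library(tidyr)

# Same long-format data as in the question
input_ds <- data.frame(
  region  = c("A", "A", "A", "B", "B", "B"),
  week    = c("20", "21", "22", "20", "21", "22"),
  average = c(5, 7, 15, 4, 8, 3),
  percent = c(30, 40, 50, 15, 27, 11)
)

# One row per region, one average_<week>/percent_<week> pair per week.
# names_vary = "slowest" (assumes tidyr >= 1.2.0) groups the columns by
# week instead of listing all averages first and all percents after.
wide_ds <- pivot_wider(
  input_ds,
  names_from  = week,
  values_from = c(average, percent),
  names_vary  = "slowest"
)

print(names(wide_ds))
```

With the columns already grouped by week, the `order()` and `setcolorder()` steps would no longer be needed. The renaming and `htmlTable()` steps should carry over unchanged, provided the `sub()` calls are pointed at `wide_ds`.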

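Regarding "r - How to reshape a dataset (long to wide) with two measurement columns per category, without additional calculations in R", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51676973/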
