
r - Adding together columns of a data.frame based on name type

Reposted. Author: 行者123. Last updated: 2023-12-05 09:23:45

Suppose I have the following data.frame, which maps the names of R packages to the CRAN Task View each one belongs to:

dictionary <- data.frame(task.view = c(rep("High.Performance.Computing", 3), rep("Machine.Learning", 3)), package = c("Rcpp", "HadoopStreaming", "rJava", "e1071", "nnet", "RWeka"))

#                  task.view         package
# High.Performance.Computing            Rcpp
# High.Performance.Computing HadoopStreaming
# High.Performance.Computing           rJava
#           Machine.Learning           e1071
#           Machine.Learning            nnet
#           Machine.Learning           RWeka

I then count how many times each package is called by each of four tools written by students:

package.referals <- data.frame(Rcpp = c(1, 0, 1, 1), HadoopStreaming = c(1, 0, 0, 0),  rJava = c(1, 0, 0, 1), e1071 = c(1, 1, 1, 1), nnet = c(1, 0, 0, 0), RWeka = c(1, 0, 0, 1), row.names = paste("student pkg", 1:4))

#               Rcpp HadoopStreaming rJava e1071 nnet RWeka
# student pkg 1    1               1     1     1    1     1
# student pkg 2    0               0     0     1    0     0
# student pkg 3    1               0     0     1    0     0
# student pkg 4    1               0     1     1    0     1

How can I regroup the columns of the package.referals data.frame above according to the package-to-Task-View relations in my dictionary data.frame?

For example, I would like the output to be:

data.frame(High.Performance.Computing = c(3, 0, 1, 2), Machine.Learning = c(3, 1, 1, 2), row.names = paste("student pkg", 1:4))

#               High.Performance.Computing Machine.Learning
# student pkg 1                          3                3
# student pkg 2                          0                1
# student pkg 3                          1                1
# student pkg 4                          2                2

I tried the following, but got stuck when trying to reshape the result into my desired output (summing and transposing):

require(data.table)

# column names of package.referals data.frame
package.referals.colnames <- names(package.referals)

# a data.table of my task view and package relations, keyed by package name
dictionary.dt <- data.table(dictionary, key = "package")

# a data.table of my package.referals data.frame, transposed, and keyed by package name
package.referals.dt <- data.table(package = package.referals.colnames, t(package.referals), key="package")

# Joining data.tables so that the package name and corresponding task view are on the same line
dt <- package.referals.dt[J(dictionary.dt)]
setkey(dt, "task.view")

#            package student pkg 1 student pkg 2 student pkg 3 student pkg 4                  task.view
# 1: HadoopStreaming             1             0             0             0 High.Performance.Computing
# 2:            Rcpp             1             0             1             1 High.Performance.Computing
# 3:           rJava             1             0             0             1 High.Performance.Computing
# 4:           e1071             1             1             1             1           Machine.Learning
# 5:            nnet             1             0             0             0           Machine.Learning
# 6:           RWeka             1             0             0             1           Machine.Learning
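
One possible way to finish the attempt above is to sum the student columns within each task view and then transpose the result. This is only a sketch under stated assumptions: it rebuilds the example data so that it runs on its own, uses merge() for the join instead of the keyed J() lookup, and introduces the names student.cols, long, joined, sums, and result purely for illustration.

```r
library(data.table)

# Same example data as in the question (strings kept as character for the join)
dictionary <- data.frame(
  task.view = c(rep("High.Performance.Computing", 3), rep("Machine.Learning", 3)),
  package = c("Rcpp", "HadoopStreaming", "rJava", "e1071", "nnet", "RWeka"),
  stringsAsFactors = FALSE
)
package.referals <- data.frame(
  Rcpp = c(1, 0, 1, 1), HadoopStreaming = c(1, 0, 0, 0), rJava = c(1, 0, 0, 1),
  e1071 = c(1, 1, 1, 1), nnet = c(1, 0, 0, 0), RWeka = c(1, 0, 0, 1),
  row.names = paste("student pkg", 1:4)
)

student.cols <- rownames(package.referals)  # hypothetical helper name

# One row per package, one column per student tool, plus the task view
long <- data.table(package = names(package.referals), t(package.referals))
joined <- merge(long, as.data.table(dictionary), by = "package")

# Sum the student columns within each task view
sums <- joined[, lapply(.SD, sum), by = task.view, .SDcols = student.cols]

# Transpose back so that student tools are rows and task views are columns
result <- t(as.matrix(sums[, student.cols, with = FALSE]))
colnames(result) <- as.character(sums$task.view)
result
```

The result here is a numeric matrix; as.data.frame(result) would turn it into a data.frame if that shape is needed.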

Best answer

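Here is a solution using reshape2 and base R:

library(reshape2)  # melt() with a variable.name argument comes from reshape2
package.referals$id <- rownames(package.referals)
# long format: one row per (student tool, package) pair
pkgr <- melt(package.referals, id.vars = "id", variable.name = "package")
# keep only the packages that were actually called (assumes 0/1 values)
pkgr <- pkgr[pkgr$value > 0, ]
df <- merge(pkgr, dictionary, all.x = TRUE)
# count the remaining rows per student tool and task view
table(df$id, df$task.view)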

If you really want to use data.table instead of merge, you can replace the merge and table calls with:

pkgr <- data.table(pkgr, key="package")
dictionary <- data.table(dictionary, key="package")
df <- pkgr[dictionary]
table(df$id, df$task.view)
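
As a variation, the same regrouping can be sketched in base R alone by splitting the package names by task view and taking row sums over each group of columns. This is an illustrative alternative rather than part of the original answer; the names groups and result are made up for the example, and the data is rebuilt so the block runs on its own. Unlike table(), which counts the remaining rows, this sums the values themselves, so it should also hold up if a count exceeds 1 or if a student tool calls no packages at all.

```r
# Same example data as in the question
dictionary <- data.frame(
  task.view = c(rep("High.Performance.Computing", 3), rep("Machine.Learning", 3)),
  package = c("Rcpp", "HadoopStreaming", "rJava", "e1071", "nnet", "RWeka")
)
package.referals <- data.frame(
  Rcpp = c(1, 0, 1, 1), HadoopStreaming = c(1, 0, 0, 0), rJava = c(1, 0, 0, 1),
  e1071 = c(1, 1, 1, 1), nnet = c(1, 0, 0, 0), RWeka = c(1, 0, 0, 1),
  row.names = paste("student pkg", 1:4)
)

# List of package names per task view (hypothetical name: groups)
groups <- split(as.character(dictionary$package), dictionary$task.view)

# For each task view, sum the matching columns row by row
result <- as.data.frame(
  sapply(groups, function(pkgs) rowSums(package.referals[, pkgs, drop = FALSE]))
)
result
```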

For this topic (r - Adding together columns of a data.frame based on name type), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19321053/
