r - Parallel computing in R: how to use the cores

Reposted. Author: 行者123. Updated: 2023-12-01 11:29:45

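I am currently trying to do parallel computation in R. I want to train a logistic ridge model, and my machine has 4 cores. I would like to split my dataset evenly into 4 parts, use each core to train the model on its own part of the training data, and save each core's result into a vector. The problem is that I don't know how to do this. Right now I am trying to parallelize with the foreach package, but every core sees the same training data. Here is the code with the foreach package (without splitting the data):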

library(ridge)
library(parallel)
library(foreach)

num_of_cores <- detectCores()
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
data_per_core <- floor(nrow(mydata)/num_of_cores)
result <- data.frame()

r <- foreach(icount(4), .combine = cbind) %dopar% {
  result <- logisticRidge(admit ~ gre + gpa + rank, data = mydata)
  coefficients(result)
}

Any idea how to split the data into x chunks and train the models in parallel at the same time?

Best answer

How about something like this? It uses the snowfall library instead of foreach, but it should give the same result.

library(snowfall)
library(ridge)

# for reproducibility
set.seed(123)
num_of_cores <- parallel::detectCores()
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
data_per_core <- floor(nrow(mydata)/num_of_cores)

# randomly assign each row to one of the cores via sampleid
mydata$sampleid <- sample(1:num_of_cores, nrow(mydata), replace = TRUE)

# small helper that fits the model on one chunk and returns its coefficients
regfun <- function(dat) {
  # ridge must be loaded inside the function; otherwise the snowfall
  # workers cannot find logisticRidge()
  library(ridge)
  result <- logisticRidge(admit ~ gre + gpa + rank, data = dat)
  coefs <- as.numeric(coefficients(result))
  return(coefs)
}

# prepare the data: one data frame per core
datlist <- lapply(1:num_of_cores, function(i) {
  mydata[mydata$sampleid == i, ]
})

# initialise the cluster
sfInit(parallel = TRUE, cpus = num_of_cores)

# export the function to the cluster (the data is passed as an argument below)
sfExport("regfun")

# calculate (sfClusterApply works much like lapply and returns a list)
res <- sfClusterApply(datlist, function(datlist.element) {
  regfun(dat = datlist.element)
})

# stop the cluster
sfStop()

# convert the list to a data.frame. data.table::rbindlist(list(res)) does the same job
res <- data.frame(t(matrix(unlist(res), ncol = num_of_cores)))
names(res) <- c("intercept", "gre", "gpa", "rank")
res
# res
#   intercept          gre           gpa         rank
# 1 -3.002592 1.558363e-03  0.7048146997 -0.382462408
# 2 -4.142939 1.060692e-03  0.9978841880 -0.314589628
# 3 -2.967130 2.315487e-03  0.6797382218 -0.464219036
# 4 -1.176943 4.786894e-05 -0.0004576679 -0.007618317
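
Since the question asked about foreach specifically, here is a minimal sketch of the same split-then-fit idea with that package. It rests on several assumptions that are not in the original post: the doParallel package is used as the backend, the data is simulated rather than read from binary.csv, and base R's glm() stands in for logisticRidge() so the example runs without the ridge package or a download. The names n_chunks, chunk_id and chunks are made up for illustration.

```r
library(foreach)
library(doParallel)  # assumed backend; not used in the original post

set.seed(123)
n_chunks <- 4  # number of data chunks; the post uses detectCores() instead

# Simulated stand-in for binary.csv (hypothetical data, same column names)
n <- 400
mydata <- data.frame(
  gre  = round(rnorm(n, mean = 580, sd = 115)),
  gpa  = round(runif(n, min = 2.3, max = 4), 2),
  rank = sample(1:4, n, replace = TRUE)
)
lin_pred <- -3.5 + 0.002 * mydata$gre + 0.8 * mydata$gpa - 0.5 * mydata$rank
mydata$admit <- rbinom(n, size = 1, prob = plogis(lin_pred))

# Randomly assign rows to equally sized chunks, then split into a list
chunk_id <- sample(rep_len(seq_len(n_chunks), n))
chunks <- split(mydata, chunk_id)

# Start two workers and register them so that %dopar% really runs in parallel
cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# Each iteration receives ONE chunk as `dat`, not the full data set
res <- foreach(dat = chunks, .combine = rbind) %dopar% {
  fit <- glm(admit ~ gre + gpa + rank, data = dat, family = binomial)
  coef(fit)  # stand-in for coefficients(logisticRidge(...))
}

parallel::stopCluster(cl)
rownames(res) <- names(chunks)
res
```

The key difference from the code in the question is that foreach iterates over the list of chunks, so each task fits the model on its own subset. To use logisticRidge() here instead of glm(), the workers would also need the ridge package, which foreach can load through its .packages argument (for example .packages = "ridge").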

Regarding "r - Parallel computing in R: how to use the cores", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/33610183/
