
r - How to randomly split a data frame into three smaller data frames with a given number of rows

Reposted. Author: 行者123. Updated: 2023-12-02 04:44:38

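I want to use R to randomly split a data frame into three smaller data frames. The first should contain 80% of the observations, and the second and third should contain 15% and 5% respectively. The three data frames must not overlap. Do you have any suggestions?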

Best answer

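Here is a quick function that splits the data into any number of groups, one group per value you supply in the "props" argument. It should be fairly self-explanatory.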

#' Splits data.frame into arbitrary number of groups
#'
#' @param dat The data.frame to split into groups
#' @param props Numeric vector. What proportion of the data should
#'   go in each group?
#' @param which.adjust Numeric. Which group size should we 'fudge' to
#'   make sure that we sample enough (or not too much)
#' @return A list of data.frames, one per element of props
split_data <- function(dat, props = c(.8, .15, .05), which.adjust = 1){

    # Make sure proportions are non-negative and not all zero,
    # and that the adjustment group is a valid index into props
    stopifnot(all(props >= 0), sum(props) > 0,
              which.adjust >= 1, which.adjust <= length(props))

    # could check to see if the sum is 1
    # but this is easier
    props <- props/sum(props)
    n <- nrow(dat)
    # How large should each group be?
    ns <- round(n * props)
    # Rounding can make sum(ns) differ from n, so force
    # the group specified in which.adjust to a size that
    # makes sum(ns) equal n
    ns[which.adjust] <- n - sum(ns[-which.adjust])

    # One group id per row: id i is repeated ns[i] times
    ids <- rep(seq_along(props), ns)
    # Shuffle ids so that the groups are randomized
    which.group <- sample(ids)
    # Every row gets exactly one id, so the groups cannot overlap
    split(dat, which.group)
}

split_data(mtcars)
split_data(mtcars, c(.7, .3))
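
To make the rounding adjustment and the no-overlap guarantee concrete, here is a minimal sketch of a check on the built-in mtcars data (32 rows). It repeats a compact copy of the function so that it runs on its own. The fixed seed and the names parts, sizes and all_rows are assumptions added for illustration and are not part of the original answer.

```r
# Compact copy of the answer's function so this block is self-contained
split_data <- function(dat, props = c(.8, .15, .05), which.adjust = 1) {
  stopifnot(all(props >= 0), which.adjust <= length(props))
  props <- props / sum(props)
  n <- nrow(dat)
  ns <- round(n * props)
  ns[which.adjust] <- n - sum(ns[-which.adjust])
  split(dat, sample(rep(seq_along(props), ns)))
}

set.seed(42)  # assumption: fixed seed only so the run is reproducible
parts <- split_data(mtcars)

# Rounding 32 * c(.8, .15, .05) gives 26 + 5 + 2 = 33 rows, one too many,
# so group 1 (which.adjust = 1) is reduced to 32 - (5 + 2) = 25 rows.
sizes <- vapply(parts, nrow, integer(1))
print(sizes)

# No overlap: each row name appears in exactly one group,
# and together the groups cover the whole data frame.
all_rows <- unlist(lapply(parts, rownames))
stopifnot(!anyDuplicated(all_rows), setequal(all_rows, rownames(mtcars)))
```

The group sizes are fixed by the rounding step, so only the assignment of rows to groups changes from run to run. Passing a different which.adjust moves the rounding correction to another group.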

Regarding "r - How to randomly split a data frame into three smaller data frames with a given number of rows", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/20041239/
