
r - Cleaning the column names of multiple data frames

Reposted. Author: 行者123. Updated: 2023-12-02 03:57:19

I want to clean the column names of several data frames at once, rather than one data frame at a time. See the code below.

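#Load stringr, which provides str_replace_all()
library(stringr)

#Create data frame with basic data
`patient ID` <- c(1, 2, 3, 4)
`Adm Date` <- as.POSIXct(c('2010-10-11','2008-3-25','2016-4-23','2011-6-12'))
diabetes <- c("Type1", "Type2", "Type1", "Type2")
`p-status` <- c("Poor", "Improved", "Excellent", "Poor")
#check.names=FALSE keeps the spaces and hyphen instead of converting them to periods
patientdata <- data.frame(`patient ID`, `Adm Date`, diabetes, `p-status`, check.names=FALSE)
patientdata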

#Find and replace spaces in column names
names(patientdata) <- str_replace_all(names(patientdata)," *",'')

#Find and replace hyphen in column name
names(patientdata) <- str_replace_all(names(patientdata),"-",'')

names(patientdata)

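I need to run the same steps (replacing spaces/periods and hyphens in the column names) on at least two different data frames, but I cannot find a way to hand str_replace_all the column names of several data frames at once. Done the usual way, this takes at least three separate str_replace_all statements per data frame. On top of that, the data frames I am working with have unrelated names (such as order_table and sales_table). Any thoughts on how to do this in fewer lines of code?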

Best answer

Here is an example of the process, step by step:

#Create data frame with basic data
`patient ID` <- c(1, 2, 3, 4)
`Adm Date` <- as.POSIXct(c('2010-10-11','2008-3-25','2016-4-23','2011-6-12'))
diabetes <- c("Type1", "Type2", "Type1", "Type2")
`p-status` <- c("Poor", "Improved", "Excellent", "Poor")
patientdata <- data.frame(`patient ID`, `Adm Date`, diabetes, `p-status`, check.names=FALSE)

#Create copies
patientdata2 <- patientdata3 <- patientdata4 <- patientdata

#Make list with all data frames
lst <- mget(ls(pattern="^patientdata"))

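#Load stringr, which provides str_replace_all()
library(stringr)

#Create a single function to house all operations
nameChange <- function(df) {
  names(df) <- str_replace_all(names(df), " *", "")  #remove spaces
  names(df) <- str_replace_all(names(df), "-", "")   #remove hyphens
  return(df)
}

#Iterate over all data frames (returns a new list; the originals are unchanged)
lapply(lst, nameChange)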
# $patientdata
#   patientID    AdmDate diabetes   pstatus
# 1         1 2010-10-11    Type1      Poor
# 2         2 2008-03-25    Type2  Improved
# 3         3 2016-04-23    Type1 Excellent
# 4         4 2011-06-12    Type2      Poor
#
# $patientdata2
#   patientID    AdmDate diabetes   pstatus
# 1         1 2010-10-11    Type1      Poor
# 2         2 2008-03-25    Type2  Improved
# 3         3 2016-04-23    Type1 Excellent
# 4         4 2011-06-12    Type2      Poor
#
# $patientdata3
#   patientID    AdmDate diabetes   pstatus
# 1         1 2010-10-11    Type1      Poor
# 2         2 2008-03-25    Type2  Improved
# 3         3 2016-04-23    Type1 Excellent
# 4         4 2011-06-12    Type2      Poor
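
Because lapply() returns a new list and leaves the original objects untouched, the cleaned data frames have to be stored somewhere. The sketch below is one possible way to do that for data frames whose names share no common pattern, as with the order_table and sales_table mentioned in the question. The column names, the sample values, and the helper clean_names() are hypothetical and only serve as illustration; the helper uses base R gsub() in place of str_replace_all().

```r
# Hypothetical stand-ins for the differently named tables in the question
order_table <- data.frame(`order ID` = 1:2, `ship-date` = c("a", "b"),
                          check.names = FALSE)
sales_table <- data.frame(`sale ID` = 1:2, `net-amount` = c(10, 20),
                          check.names = FALSE)

# Hypothetical helper: remove spaces and hyphens from the column names
clean_names <- function(df) {
  names(df) <- gsub("[ -]", "", names(df))
  df
}

# Build a named list by hand, since the names share no common prefix
tables <- list(order_table = order_table, sales_table = sales_table)

# Keep the result of lapply(); the original objects are not modified
tables <- lapply(tables, clean_names)

# Optionally copy the cleaned data frames back over the originals
list2env(tables, envir = .GlobalEnv)

names(order_table)
names(sales_table)
```

Keeping the data frames in the list is often enough; the list2env() step is only needed when the rest of the script refers to the original object names.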

If preferred, we can also skip the list and apply the function to each data frame directly:

patientdata <- nameChange(patientdata)
patientdata2 <- nameChange(patientdata2)
patientdata3 <- nameChange(patientdata3)
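
The question also mentions periods. These appear when data.frame() is called with its default check.names = TRUE, which converts spaces and hyphens in column names to periods. As a rough sketch, a single character class can remove spaces, periods and hyphens in one call; the helper name strip_separators() and the two-row sample data are assumptions made for this example, and base R gsub() is used so that the snippet needs no packages.

```r
# With the default check.names = TRUE, "patient ID" becomes "patient.ID"
# and "p-status" becomes "p.status"
df <- data.frame(`patient ID` = 1:2, `p-status` = c("Poor", "Improved"))
original_names <- names(df)

# Hypothetical helper: drop spaces, periods and hyphens in a single pass
strip_separators <- function(df) {
  names(df) <- gsub("[ .-]", "", names(df))
  df
}

df <- strip_separators(df)
cleaned_names <- names(df)
cleaned_names
```

Inside the square brackets the period is literal and the trailing hyphen is not treated as a range, so no escaping is needed.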

Regarding r - cleaning the column names of multiple data frames, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38810199/
