
arrays - Convert a PostgreSQL integer[] array to a list of numerics in R

Reposted. Author: 行者123. Updated: 2023-11-29 13:51:25

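I store the result of a PostgreSQL query as a data.frame in R. One of the "columns" has the array type integer[]. The RPostgreSQL package imports it into R as a character string.

How can I convert that string column in my data.frame into a list column of numerics (or into separate numeric columns)?

Connect and fetch the data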
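To make the starting point concrete, here is a small sketch of what such an imported column looks like. The data frame `toy` and its values are made up for illustration; the only assumption carried over is that each array arrives as a single string in PostgreSQL's text form, such as `{1,2,3}`.

```r
# Hypothetical stand-in for the result of dbGetQuery(): the integer[] column
# arrives as one character string per row.
toy <- data.frame(
  id = 1:3,
  arrcol = c("{1,2,3}", "{40,50,60}", "{7,8,9}"),
  stringsAsFactors = FALSE
)

# The column is a plain character vector, not a list of numbers.
str(toy$arrcol)
is.character(toy$arrcol)
```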

library(RPostgreSQL)
drv = dbDriver("PostgreSQL")
# fill in your own connection details in place of the "..."
con = dbConnect(drv, host = ..., port = ..., dbname = ..., user = ..., password = ...)
df = dbGetQuery(con, query_string)
dbDisconnect(con)

Minimal working example

library(dplyr)
# randomized arrays of 10 numbers
set.seed(10)
df = data.frame(id = c(1:10)) %>%
  mutate(arrcol = lapply(id, function(X) sample(1:99, 10, replace=T)),
         arrcol = gsub("c(","{{",arrcol,fixed=T),
         arrcol = gsub(")","}}",arrcol,fixed=T))

Remove the braces

df$arrcol = gsub(fixed=T, "{", "", df$arrcol)
df$arrcol = gsub(fixed=T, "}", "", df$arrcol)

Convert to a list of numerics

# Attempt 1: 
df$arrcol = as.numeric(df$arrcol)
# Error: (list) object cannot be coerced to type 'double'

# Attempt 2:
df$arrcol = lapply(df$arrcol,
                   function(x) strsplit(x, ",", fixed=T))
# no error, but now the data appears to be stored as a list of character lists:
# arrcol[1]: list(c("1", "2", "3", "4", "5",...

# Attempt 3:
df$arrcol = lapply(df$arrcol,
                   function(x) as.numeric(
                     unlist(
                       strsplit(x, ",", fixed=T))
                   )
)
# this one seems to work
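
The difference between attempts 2 and 3 comes from `strsplit()` always returning a list, with one element per input string. A minimal sketch on a single made-up string (the names `x`, `nested`, and `flat` are only for illustration):

```r
x <- "1,2,3"

# strsplit() returns a list even for one string, so wrapping it in lapply()
# produces a list inside a list, as in attempt 2.
nested <- lapply(x, function(s) strsplit(s, ",", fixed = TRUE))
str(nested)

# Taking the first element (or calling unlist()) before as.numeric() gives
# one numeric vector per row, as in attempt 3.
flat <- lapply(x, function(s) as.numeric(strsplit(s, ",", fixed = TRUE)[[1]]))
str(flat)
```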

Best answer

My own best answer:

df$numcol = gsub(fixed=T, "{", "", df$arrcol)
df$numcol = gsub(fixed=T, "}", "", df$numcol)

df$numcol <- lapply(df$numcol,
                    function(x) as.numeric(
                      unlist(
                        strsplit(x = x, split = ",", fixed=T)
                      )
                    )
)

[Updated to do all the steps in one go]

df$numcol <- lapply(df$arrcol,
                    function(x) as.numeric(
                      unlist(
                        strsplit(
                          x = gsub("[{]|[}]", "", x),
                          split = ",", fixed=T))))

Or, equivalently:

df$numcol <- lapply(df$arrcol, 
function(x) as.numeric(
strsplit(
x = gsub("[{]|[}]", "", x),
split = ",", fixed=T)[[1]]
)
)
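
If the same conversion is needed in several places, it could be wrapped in a small helper. The sketch below is only one possible shape: the function name `parse_pg_int_array` is hypothetical, it returns integers via `as.integer()` rather than `as.numeric()`, and it assumes that a NULL array arrives as `NA` and an empty array as `"{}"`.

```r
# Hypothetical helper: turn strings like "{1,2,3}" into a list of integer vectors.
parse_pg_int_array <- function(x) {
  stripped <- gsub("[{}]", "", x)                 # drop the braces; NA stays NA
  lapply(stripped, function(s) {
    if (is.na(s)) return(NA_integer_)             # assumed: NULL array arrives as NA
    if (!nzchar(s)) return(integer(0))            # assumed: empty array arrives as "{}"
    as.integer(strsplit(s, ",", fixed = TRUE)[[1]])
  })
}

parsed <- parse_pg_int_array(c("{1,2,3}", "{}", NA))
str(parsed)
```

With the data frame from the question this would be called as `df$numcol <- parse_pg_int_array(df$arrcol)`. Swap `as.integer()` for `as.numeric()` if doubles are preferred.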

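Alternatively (as long as every array has the same length), you can use this trick (Splitting a dataframe string column into multiple different columns) to parse the strings into separate columns. Note that read.table is smart enough to recognize each new variable as an integer.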

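# Strip the braces first; otherwise the first and last columns are read as text.
newdf = read.table(text = gsub("[{]|[}]", "", df$arrcol), header = F, sep = ",")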

You can also easily attach them to the original data.frame as columns of their own:

df = cbind(df, newdf)

Or, if you know how many new columns will result:

# 3:101 addresses 99 new columns; adjust the range to the length of your arrays.
df[,3:101] <- read.table(text = gsub("[{]|[}]", "", df$arrcol), 
                         header = F, sep = ",")
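
When the array length is not known up front, the column range does not have to be hard-coded. A minimal sketch, assuming equal-length arrays; the data frame `toy` and the `arr_` column prefix are made up for illustration:

```r
# Hypothetical example data: two rows, three values per array.
toy <- data.frame(id = 1:2, arrcol = c("{1,2,3}", "{4,5,6}"),
                  stringsAsFactors = FALSE)

# Parse the strings into columns, then name them from the observed width.
newdf <- read.table(text = gsub("[{}]", "", toy$arrcol),
                    header = FALSE, sep = ",")
names(newdf) <- paste0("arr_", seq_len(ncol(newdf)))

toy <- cbind(toy, newdf)
str(toy)
```

Naming the columns from `ncol(newdf)` avoids overwriting existing columns by position.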

Regarding arrays - Convert a PostgreSQL integer[] array to a list of numerics in R, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40637545/
