
R DBI SQL Server: dbWriteTable truncates rows; the field.types parameter has no effect

Reposted · Author: 行者123 · Updated: 2023-12-05 03:43:18

I am using a SQL Server database through JDBC and pool, with the sqljdbc42.jar driver.

Code:

library(DBI)
library(RJDBC)
library(pool)

jar.path.ms.sql.driver <- "./www/base/sql_drivers/sqljdbc42.jar"

jdbc.sql.driver <- JDBC(
  driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
  classPath = jar.path.ms.sql.driver
)
sql.url <- "jdbc:sqlserver://sql01" # our server url
sql.dname <- "my_projects"
sql.username <- "user"
sql.password <- "password"

pool <- dbPool(
  drv = jdbc.sql.driver,
  url = sql.url,
  dname = sql.dname,
  username = sql.username,
  password = sql.password
)

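So far, so good. I can read tables from the database with dbReadTable, and I can write tables to it with dbWriteTable. In the write case, the data frame has columns that contain very long strings; in some cases a string is longer than 255 characters.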
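To see which columns actually need more than 255 characters before writing, something like the following base R check may help. It is only a sketch: the small comments data frame here is made-up sample data standing in for the real one, and the 255 threshold is taken from the column width observed in the question.

```r
# Hypothetical sample data standing in for the real `comments` data frame
comments <- data.frame(
  OBJECTIVE = c("short text", "another short text"),
  FEB = c(strrep("x", 300), "ok"),
  TARGET = c(1.5, 2),
  stringsAsFactors = FALSE
)

# Pick out the character columns
is_char <- vapply(comments, is.character, logical(1))

# Longest string per character column
max_len <- vapply(
  comments[is_char],
  function(col) max(nchar(col), na.rm = TRUE),
  numeric(1)
)

# Columns that would not fit into varchar(255)
too_long <- names(max_len)[max_len > 255]
print(too_long)
```

Columns reported here are the ones that need a wider type such as varchar(MAX).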

I had hoped to use the field.types argument of dbWriteTable to configure the SQL Server columns correctly (comments is the data frame I am writing to the database):

conn <- poolCheckout(pool)

DBI::dbWriteTable(
  conn = conn,
  DBI::Id(cluster = "my_projects", schema = "dbo", table = "comments"),
  value = comments,
  overwrite = TRUE,
  row.names = FALSE,
  field.types = c(
    STRATEGIC_AREA = "varchar(255)",
    OBJECTIVE = "varchar(255)",
    METHOD_OF_MEASURE = "varchar(MAX)",
    TARGET = "float",
    UNIT = "varchar(255)",
    MIN_MAX = "varchar(255)",
    JUL = "varchar(1024)",
    AUG = "varchar(MAX)",
    SEP = "varchar(MAX)",
    OCT = "varchar(MAX)",
    NOV = "varchar(MAX)",
    DEC = "varchar(MAX)",
    JAN = "varchar(MAX)",
    FEB = "varchar(MAX)",
    MAR = "varchar(MAX)",
    APR = "varchar(MAX)",
    MAY = "varchar(MAX)",
    JUN = "varchar(MAX)"
  )
)
poolReturn(conn)
poolClose(pool)

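The problem is that the column types are ignored: the table is consistently created with varchar(255) columns. Types given in field.types, such as varchar(1024) and varchar(MAX), have no effect. The function seems to follow a fixed mapping from the data frame's column classes:

  • character -> varchar(255)
  • numeric -> float

It makes no difference if I replace DBI::Id(cluster = "my_projects", schema = "dbo", table = "comments") with "my_projects.dbo.comments". It also makes no difference if I quote the names on the left-hand side of the named character vector passed to field.types.

As a result, data frame rows that contain strings longer than 255 characters are rejected as truncated (those rows are skipped), and I end up with fewer rows in the SQL Server table than in the data frame. The error message (the very long string is in column FEB):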

Error in .local(conn, statement, ...) :
execute JDBC update query failed in dbSendUpdate (String or binary data would be truncated in table 'my_projects.dbo.comments', column 'FEB'. Truncated value:

What am I missing?

Update

Following sgoley's answer below, I changed the code as follows:

dbCreateTable(
  conn = conn,
  Id(
    database = "my_projects",
    schema = "dbo",
    table = "comments"
  ),
  fields = c(
    STRATEGIC_AREA = "varchar(255)",
    OBJECTIVE = "varchar(255)",
    METHOD_OF_MEASURE = "varchar(255)",
    TARGET = "float",
    UNIT = "varchar(255)",
    MIN_MAX = "varchar(255)",
    JUL = "varchar(MAX)",
    AUG = "varchar(MAX)",
    SEP = "varchar(MAX)",
    OCT = "varchar(MAX)",
    NOV = "varchar(MAX)",
    DEC = "varchar(MAX)",
    JAN = "varchar(MAX)",
    FEB = "varchar(MAX)",
    MAR = "varchar(MAX)",
    APR = "varchar(MAX)",
    MAY = "varchar(MAX)",
    JUN = "varchar(MAX)"
  ),
  row.names = NULL
)

values <- DBI::sqlAppendTable(
  con = conn,
  table = Id(database = "my_projects", schema = "dbo", table = "comments"),
  values = comments,
  row.names = FALSE
)
RJDBC::dbSendUpdate(conn, values)

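This now works without any problems or error messages. Checking in Microsoft SQL Server Management Studio confirms that the columns now have the correct types and follow the fields specification.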
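The same check can probably be done from R instead of Management Studio by querying SQL Server's INFORMATION_SCHEMA.COLUMNS view. The sketch below only builds the query text; column_type_query is a hypothetical helper name, and the commented-out dbGetQuery call assumes an open connection conn to the database that holds the table.

```r
library(DBI)

# Hypothetical helper: builds a query listing the column types of one table.
# dbQuoteString() with the generic ANSI() dialect is used to quote the literals.
column_type_query <- function(schema, table) {
  paste0(
    "SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH ",
    "FROM INFORMATION_SCHEMA.COLUMNS ",
    "WHERE TABLE_SCHEMA = ", as.character(dbQuoteString(ANSI(), schema)),
    " AND TABLE_NAME = ", as.character(dbQuoteString(ANSI(), table)),
    " ORDER BY ORDINAL_POSITION"
  )
}

query <- column_type_query("dbo", "comments")
cat(query, "\n")

# With a live connection (assumed, not run here):
# DBI::dbGetQuery(conn, query)
```

In the result, CHARACTER_MAXIMUM_LENGTH is -1 for varchar(MAX) columns and 255 for varchar(255) columns, which makes a silently ignored type easy to spot.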

Best answer

Trying to build as complete an answer as possible - let's start with the RJDBC project


RJDBC

repo: https://github.com/s-u/RJDBC

It seems to me that Simon may be unintentionally overriding all explicit field type declarations with his driver-level type mapping here?

maptypes.R

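I say that only as very high-level speculation, because I am definitely out of my depth when it comes to the intersection of R, Java, and DBI. But if the shoe fits?

When I run something like: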

> dbDataType(jdbc.sql.driver,titanic)
Class Sex Age Survived Freq
"TEXT" "TEXT" "TEXT" "TEXT" "DOUBLE"

The driver appears to map these types to varchar(255), which is why you see every created column default to that type.
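For comparison, the same kind of inspection can be tried without a Java driver by using DBI's generic ANSI() dialect. This is only a sketch: titanic_sample is made-up data, the wanted vector repeats some of the types requested later in this answer, and the types the RJDBC driver reports may differ from what ANSI() reports.

```r
library(DBI)

# Made-up sample with the same kinds of columns as the titanic data
titanic_sample <- data.frame(
  Class = c("1st", "2nd"),
  Sex = c("Male", "Female"),
  Freq = c(0, 35),
  stringsAsFactors = FALSE
)

# Types DBI infers from the R column classes (generic ANSI dialect)
inferred <- dbDataType(ANSI(), titanic_sample)
print(inferred)

# Types we actually want in SQL Server, for a side-by-side look
wanted <- c(Class = "nvarchar(max)", Sex = "varchar(max)", Freq = "integer")
print(data.frame(
  column = names(inferred),
  inferred = unname(inferred),
  wanted = unname(wanted[names(inferred)])
))
```

Any row where the inferred and wanted types differ is a column that needs an explicit type when the table is created.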

Anyway, that is just a guess at the cause. On to solving the problem.


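Columns and types

Drawing on all the comments and suggestions above, the best (and so far only) way I have found to get these data types in SQL Server, short of doing it fully explicitly through dbExecute (which you could), is: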

pool <- dbPool(
  drv = jdbc.sql.driver,
  url = sql.url,
  dname = sql.dname,
  schema = sql.schema, # note: sql.schema is not defined in the question's code; set it first
  username = sql.username,
  password = sql.password
)

conn <- poolCheckout(pool)

# Create the table with explicit SQL Server column types
dbCreateTable(
  conn = conn,
  Id(database = "my_project", schema = "dbo", table = "titanic"),
  fields = c(
    Class = "nvarchar(max)",
    Sex = "varchar(max)",
    Age = "ntext",
    Survived = "text",
    Freq = "integer"
  ),
  row.names = NULL
)

Which produces:

[Screenshot: the resulting table in SSMS]
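If you want to see the statement before it touches the server, DBI::sqlCreateTable can generate the CREATE TABLE text without a connection. A minimal sketch, assuming the generic ANSI() dialect for quoting and a bare table name instead of the full Id(...); the statement the JDBC connection actually sends may be quoted differently.

```r
library(DBI)

# Explicit column types, as in the dbCreateTable call above
fields <- c(
  Class = "nvarchar(max)",
  Sex = "varchar(max)",
  Age = "ntext",
  Survived = "text",
  Freq = "integer"
)

# Generate (but do not run) the CREATE TABLE statement
create_sql <- sqlCreateTable(ANSI(), "titanic", fields = fields, row.names = FALSE)
print(create_sql)
```

Because fields is a named character vector rather than a data frame, the types are used verbatim and no type inference takes place.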


Inserting data

From there, the idea was to insert the rows with something like:

dbAppendTable(
  conn = conn,
  name = Id(database = "my_project", schema = "dbo", table = "titanic"),
  value = titanic
)

But I ran into the following error:

Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set",  : 
Unable to retrieve JDBC result set
JDBC ERROR: The value is not set for the parameter number 2.
Statement: INSERT INTO "my_project"."dbo"."titanic"
("Class", "Sex", "Age", "Survived", "Freq")
VALUES
(?, ?, ?, ?, ?)
In addition: Warning message:
In if (is.na(v)) { :
the condition has length > 1 and only the first element will be used

So for now the better option is to use:

values <- DBI::sqlAppendTable(
  con = conn,
  table = Id(database = "my_project", schema = "dbo", table = "titanic"),
  values = titanic
)
DBI::dbExecute(conn = conn, values)

Which produces the expected rows with the requested column types: [Screenshot: data values with correct column types]
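One thing to keep in mind: sqlAppendTable inlines every value into a single INSERT statement, and as far as I know SQL Server limits a VALUES list in an INSERT to 1000 rows. For larger data frames it may therefore be necessary to split the data into chunks. The sketch below uses made-up data, a bare table name, and the generic ANSI() dialect so that it runs without a server; the dbExecute loop is commented out and assumes an open connection conn.

```r
library(DBI)

# Made-up data: 2500 rows, more than fits into one 1000-row INSERT
titanic <- data.frame(
  Class = rep(c("1st", "2nd", "3rd"), length.out = 2500),
  Freq = seq_len(2500),
  stringsAsFactors = FALSE
)

# Assign each row to a chunk of at most 1000 rows
chunk_size <- 1000
chunk_id <- ceiling(seq_len(nrow(titanic)) / chunk_size)
chunks <- split(titanic, chunk_id)

# One INSERT statement per chunk
statements <- lapply(chunks, function(chunk) {
  sqlAppendTable(ANSI(), "titanic", chunk, row.names = FALSE)
})
print(length(statements))

# With a live connection (assumed, not run here):
# for (s in statements) DBI::dbExecute(conn, s)
```

With a real connection you would pass conn and the full Id(...) to sqlAppendTable instead of ANSI() and the bare name, as in the code above.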

For "R DBI SQL Server: dbWriteTable truncates rows; the field.types parameter has no effect", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66864660/
