gpt4 book ai didi

r - 如何将 odbc 包安装到 Databricks 集群?

转载 作者:行者123 更新时间:2023-12-04 15:46:31 25 4
gpt4 key购买 nike

我需要从 Databricks 中的 R 笔记本访问 Azure SQL 数据库。为此,我打算使用 odbc 包,它在我的本地 R 实例上安装得很好。

我尝试使用 Databricks 的界面将包安装到集群,但总是失败。我还在笔记本中尝试了以下代码:

install.packages("odbc")

结果是:

Installing package into ‘/databricks/spark/R/lib’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/odbc_1.1.6.tar.gz'
Content type 'application/x-gzip' length 288033 bytes (281 KB)
==================================================
downloaded 281 KB

* installing *source* package ‘odbc’ ...
** package ‘odbc’ successfully unpacked and MD5 sums checked
PKG_CFLAGS=
PKG_LIBS=-lodbc
<stdin>:1:17: fatal error: sql.h: No such file or directory
compilation terminated.
------------------------- ANTICONF ERROR ---------------------------
Configuration failed because odbc was not found. Try installing:
* deb: unixodbc-dev (Debian, Ubuntu, etc)
* rpm: unixODBC-devel (Fedora, CentOS, RHEL)
* csw: unixodbc_dev (Solaris)
* brew: unixodbc (Mac OSX)
To use a custom odbc set INCLUDE_DIR and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------------------------------------------------
ERROR: configuration failed for package ‘odbc’
* removing ‘/databricks/spark/R/lib/odbc’

The downloaded source packages are in
‘/tmp/RtmpqHp2QM/downloaded_packages’

我也尝试过从 github 安装:

library(devtools)
devtools::install_github("r-dbi/odbc")

这给出了不同的错误:

Downloading GitHub repo r-dbi/odbc@master
Installing 3 packages: assertthat, BH, Rcpp
Installing packages into ‘/databricks/spark/R/lib’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/assertthat_0.2.1.tar.gz'
Content type 'application/x-gzip' length 12742 bytes (12 KB)
==================================================
downloaded 12 KB

trying URL 'https://cloud.r-project.org/src/contrib/BH_1.69.0-1.tar.gz'
Content type 'application/x-gzip' length 12378154 bytes (11.8 MB)
==================================================
downloaded 11.8 MB

trying URL 'https://cloud.r-project.org/src/contrib/Rcpp_1.0.1.tar.gz'
Content type 'application/x-gzip' length 3661123 bytes (3.5 MB)
==================================================
downloaded 3.5 MB

* installing *source* package ‘assertthat’ ...
** package ‘assertthat’ successfully unpacked and MD5 sums checked
** R
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (assertthat)
* installing *source* package ‘BH’ ...
** package ‘BH’ successfully unpacked and MD5 sums checked
** inst
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (BH)
* installing *source* package ‘Rcpp’ ...
** package ‘Rcpp’ successfully unpacked and MD5 sums checked
** libs
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include/ -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c Date.cpp -o Date.o
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include/ -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c Module.cpp -o Module.o
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include/ -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c Rcpp_init.cpp -o Rcpp_init.o
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include/ -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c api.cpp -o api.o
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include/ -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c attributes.cpp -o attributes.o
g++ -I/usr/share/R/include -DNDEBUG -I../inst/include/ -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c barrier.cpp -o barrier.o
g++ -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o Rcpp.so Date.o Module.o Rcpp_init.o api.o attributes.o barrier.o -L/usr/lib/R/lib -lR
installing to /databricks/spark/R/lib/Rcpp/libs
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (Rcpp)

The downloaded source packages are in
‘/tmp/RtmpqHp2QM/downloaded_packages’
Error in processx::run(bin, args = real_cmdargs, stdout_line_callback = real_callback(stdout), :
System command error
In addition: Warning messages:
1: In install.packages("odbc") :
installation of package ‘odbc’ had non-zero exit status
2: In install.packages("odbc") :
installation of package ‘odbc’ had non-zero exit status

知道为什么这个包在本地工作正常时无法安装在 Databricks 上,而当我尝试安装在 Databricks 上的所有其他包使用相同的语法都可以正常工作时吗?

最佳答案

访问 SQL 数据库的最佳选择是使用预安装的 JDBC 连接(请参阅 Documentation)。如果你想使用 ODBC,这需要(如评论之一所述)unix odbc。安装多个包的好习惯是使用 init-scripts .以下 python 代码用于为 pyodbc 安装创建初始化脚本。

script = """
sudo apt-get -q -y install unixodbc unixodbc-dev
sudo apt-get -q -y install python3-dev
sudo pip install pyodbc
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
sudo curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql
"""

dbutils.fs.put("/databricks/init/pyodbc/pyodbc.sh", script, True)

希望这对您有所帮助。

关于r - 如何将 odbc 包安装到 Databricks 集群?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/55517978/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com