
r - Error "Invalid method csv for object" when using spark_read_csv in sparklyr

Reposted. Author: 可可西里. Updated: 2023-11-01 16:39:33

I am trying to read data from HDFS into R. One thing I struggle with when using sparklyr is deciphering the error messages, because I am not a Java programmer.

Consider this example:

Do this in R:

Create the abalone data frame. Abalone is a dataset used in machine learning examples.

# Load the PivotalR package (contains the abalone data) and create the data frame
if (!require(PivotalR)) {
  install.packages("PivotalR")  # the package name must be a quoted string
  library(PivotalR)
}

data(abalone)

#sample of data
head(abalone)

#export data to a CSV file
if (!require(readr)) {
  install.packages("readr")  # the package name must be a quoted string
  library(readr)
}
write_csv(abalone, 'abalone.csv')

Do this at the command line:

hdfs dfs -put abalone.csv abalone.csv
#check to see if the file is on the hdfs
hdfs dfs -ls

Do this in R. This is set up to use your current version of Spark; you may need to change spark_home.

library(sparklyr)
library(SparkR)
sc = spark_connect(master = 'yarn-client',
                   spark_home = '/usr/hdp/current/spark-client',
                   app_name = 'sparklyr',
                   config = list(
                     "sparklyr.shell.executor-memory" = "1G",
                     "sparklyr.shell.driver-memory" = "4G",
                     "spark.driver.maxResultSize" = "2G" # may need to transfer a lot of data into R
                   )
)

Read in the abalone file we just wrote to HDFS. You will have to change the path to match your own.

df <- spark_read_csv(sc,name='abalone',path='hdfs://pnhadoop/user/stc004/abalone.csv',delimiter=",",
header=TRUE)

I get the following error:

Error: java.lang.IllegalArgumentException: invalid method csv for object 63
at sparklyr.Invoke$.invoke(invoke.scala:113)
at sparklyr.StreamHandler$.handleMethodCall(stream.scala:89)
at sparklyr.StreamHandler$.read(stream.scala:55)
at sparklyr.BackendHandler.channelRead0(handler.scala:49)
at sparklyr.BackendHandler.channelRead0(handler.scala:14)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)

I don't know what is going on. I have used spark_read_csv before without errors, and I don't know how to decipher the Java error. Any ideas?

Best answer

Spark 2.1.0

sparkR.session( sparkConfig = list(),enableHiveSupport= FALSE)
df1 <- read.df(path="hdfs://<yourpath>/*",source="csv",na.strings = "NA", delimiter="\u0001")
head(df1)
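
The error says the JVM object has no method named csv. One possible explanation, which the question does not confirm, is a version mismatch: the built-in DataFrameReader.csv method exists only in Spark 2.0 and later, so the call would fail if the cluster at spark_home actually runs Spark 1.x. The sketch below is a rough check under that assumption. `supports_native_csv` is a hypothetical helper name, not part of sparklyr or SparkR.

```r
# Hypothetical helper (not part of sparklyr): returns TRUE when the given
# Spark version is 2.0.0 or later, i.e. when the built-in
# DataFrameReader.csv method is available.
supports_native_csv <- function(version) {
  numeric_version(as.character(version)) >= numeric_version("2.0.0")
}

supports_native_csv("2.1.0")
supports_native_csv("1.6.2")

# With a live sparklyr connection `sc`, the version can be queried with:
# supports_native_csv(sparklyr::spark_version(sc))
```

If the check comes back FALSE for your cluster, pointing spark_home at a Spark 2.x installation may be worth trying before anything else.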

Regarding r - Error "Invalid method csv for object" when using spark_read_csv in sparklyr, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44285825/
