
apache-spark - Zeppelin cannot read a file path on the local file system


I have installed Zeppelin with Docker on a Windows system, and I am now trying to run the code from the Zeppelin Tutorial against a local file, but it throws this error:

java.net.URISyntaxException: Expected scheme-specific part at index 2: C:
at java.net.URI$Parser.fail(URI.java:2848)
at java.net.URI$Parser.failExpecting(URI.java:2854)
at java.net.URI$Parser.parse(URI.java:3057)
at java.net.URI.<init>(URI.java:746)
at org.apache.hadoop.fs.Path.initialize(Path.java:203)
at org.apache.hadoop.fs.Path.<init>(Path.java:172)
at org.apache.hadoop.fs.Path.<init>(Path.java:94)
at org.apache.hadoop.fs.Globber.glob(Globber.java:201)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1643)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:222)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:270)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:194)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)

This is the path I am using:

file:///C:/xampp/htdocs/bank/bank-full.csv



Code:
val bankText = sc.textFile("file:///C:/xampp/htdocs/bank/bank-full.csv")

case class Bank(age:Integer, job:String, marital : String, education : String, balance : Integer)

val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
  s => Bank(
    s(0).toInt,
    s(1).replaceAll("\"", ""),
    s(2).replaceAll("\"", ""),
    s(3).replaceAll("\"", ""),
    s(5).replaceAll("\"", "").toInt
  )
)

bank.toDF().registerTempTable("bank")
%sql select * from bank

Please help me.

Thanks in advance!

Best Answer

Use file:///xampp/htdocs/bank/bank-full.csv and make sure your file is also located on the C: drive. The URISyntaxException is thrown because the Windows drive letter C: in the path is treated as a URI scheme when Hadoop parses the path, so dropping the drive letter avoids the error.
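A minimal sketch of the corrected read, assuming the CSV is reachable at that path from wherever the Spark interpreter actually runs (the file name and location come from the question; adjust the path if your Docker setup mounts the file elsewhere):

// Path written without the Windows drive letter, as suggested above
val bankText = sc.textFile("file:///xampp/htdocs/bank/bank-full.csv")

// Quick sanity check that the file is readable before building the table
bankText.take(3).foreach(println)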

Regarding apache-spark - Zeppelin cannot read a file path on the local file system, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51476318/
