scala - SBT project throws java.io.FileNotFoundException: HADOOP_HOME is unset

Reposted. Author: 行者123. Updated: 2023-12-02 20:42:43

I am trying to convert an Avro file to a Parquet file using AvroParquetWriter. I load the schema:

val schema: org.apache.avro.Schema = ... getSchema(...)
val parquetFile = new Path("Location/for/parquetFile.txt")
val writer = new AvroParquetWriter[GenericRecord](parquetFile, schema)

My code runs fine until it initializes the AvroParquetWriter. It then throws this error:
> java.lang.RuntimeException: java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
>   at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:722)
>   at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:256)
>   at org.apache.hadoop.util.Shell.getSetPermissionCommand(Shell.java:273)
>   at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:767)
>   at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:235)...etc

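The advice in the error message, and everything else I am finding, is about how to fix this when you are running a Hadoop cluster on your machine. However, I am not running a Hadoop cluster, nor do I intend to. I have imported some of its libraries in my SBT file for use in various other parts of my program, but that does not start a local cluster.

The error appeared out of nowhere. Of my two colleagues, one can run the code without issue, and the other has just started hitting the same problem I have. Here is (the relevant part of) my build.sbt: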
lazy val root = (project in file("."))
  .settings(
    commonSettings,
    name := "My project",
    version := "0.1",
    libraryDependencies ++= Seq(
      "org.apache.hadoop" % "hadoop-common" % "2.9.0",
      "com.typesafe.akka" %% "akka-actor" % "2.5.2",
      "com.lightbend.akka" %% "akka-stream-alpakka-s3" % "0.9",
      "com.enragedginger" % "akka-quartz-scheduler_2.12" % "1.6.0-akka-2.4.x",
      "com.typesafe.akka" % "akka-agent_2.12" % "2.5.2",
      "com.typesafe.akka" % "akka-remote_2.12" % "2.5.2",
      "com.typesafe.akka" % "akka-stream_2.12" % "2.5.2",
      "org.apache.kafka" % "kafka-clients" % "0.10.2.1",
      "com.typesafe.akka" %% "akka-stream-kafka" % "0.16",
      "com.typesafe.akka" %% "akka-persistence" % "2.5.2",
      "org.iq80.leveldb" % "leveldb" % "0.7",
      "org.fusesource.leveldbjni" % "leveldbjni-all" % "1.8",
      "javax.mail" % "javax.mail-api" % "1.5.6",
      "com.sun.mail" % "javax.mail" % "1.5.6",
      "commons-io" % "commons-io" % "2.5",
      "org.apache.avro" % "avro" % "1.8.1",
      "net.liftweb" % "lift-json_2.12" % "3.1.0-M1",
      "com.google.code.gson" % "gson" % "2.8.1",
      "org.json4s" %% "json4s-jackson" % "3.5.2",
      "com.amazonaws" % "aws-java-sdk-s3" % "1.11.149",
      //"com.amazonaws" % "aws-java-sdk" % "1.11.286",
      "org.scalikejdbc" %% "scalikejdbc" % "3.0.0",
      "org.scalikejdbc" %% "scalikejdbc-config" % "3.0.0",
      "org.scalikejdbc" % "scalikejdbc-interpolation_2.12" % "3.0.2",
      "com.microsoft.sqlserver" % "mssql-jdbc" % "6.1.0.jre8",
      "org.apache.commons" % "commons-pool2" % "2.4.2",
      "commons-pool" % "commons-pool" % "1.6",
      "com.jcraft" % "jsch" % "0.1.54",
      "ch.qos.logback" % "logback-classic" % "1.2.3",
      "com.typesafe.scala-logging" %% "scala-logging" % "3.7.2",
      "org.scalactic" %% "scalactic" % "3.0.4",
      "mysql" % "mysql-connector-java" % "8.0.8-dmr",
      "org.scalatest" %% "scalatest" % "3.0.4" % "test"
    )
  )

Any ideas why code that depends on the Hadoop libraries will not run?

Best answer

The answer is to follow the advice in the error message:

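  • I downloaded the latest version of winutils.exe from
    https://github.com/steveloughran/winutils/tree/master/hadoop-3.0.0/bin
  • I then manually created this directory structure: C:/Users/MyName/Hadoop/bin. Note that bin must exist. You can name the Hadoop/ directory whatever you like, but bin/ must sit one level inside it.
  • I placed winutils.exe in that bin directory.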
  • In my code, I had to put this line above the point where the Parquet writer is initialized (I imagine any time before initialization works) to set the Hadoop home:

    System.setProperty("hadoop.home.dir", "C:/Users/nhanak/Hadoop/")
    val writer = new AvroParquetWriter[GenericRecord](parquetFile, iInfo.schema)
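  • Optional: if you want to keep this inside the project rather than on your local machine (for example, because other people will pull this repository, or because you want to package it in a jar to send elsewhere), create the directory structure inside your project and store winutils.exe there.
    So, say you create the directory structure src/main/resources/HadoopResources/bin in your project; put winutils.exe in that bin directory. Then, to use winutils.exe, set the Hadoop home like this:

    import java.io.File
    val file = new File("src/main/resources/HadoopResources")
    System.setProperty("hadoop.home.dir", file.getAbsolutePath)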
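
The steps above can be sketched as a small helper that checks the directory layout before setting the property, so a missing or misplaced winutils.exe is reported with a clear message instead of surfacing later as the stack trace from the question. This is only a minimal sketch under stated assumptions: the names HadoopHomeSetup, hasWinutils and configure are hypothetical and are not part of Hadoop or of the original answer, and the code uses nothing beyond java.io.File.

```scala
import java.io.File

// Hypothetical helper (not a Hadoop API): validates the layout described
// in the answer and sets the hadoop.home.dir system property.
object HadoopHomeSetup {

  /** True when `home` contains bin/winutils.exe. */
  def hasWinutils(home: File): Boolean =
    new File(new File(home, "bin"), "winutils.exe").isFile

  /**
   * Sets hadoop.home.dir to the absolute path of `home` when the layout is valid.
   * Returns Right(absolute path) on success and Left(reason) otherwise;
   * the property is left untouched in the failure case.
   */
  def configure(home: File): Either[String, String] =
    if (hasWinutils(home)) {
      val path = home.getAbsolutePath
      System.setProperty("hadoop.home.dir", path)
      Right(path)
    } else {
      Left(s"Expected ${new File(home, "bin/winutils.exe")} to exist")
    }
}
```

As in the answer, the call would go before the writer is created, for example HadoopHomeSetup.configure(new File("src/main/resources/HadoopResources")), and the caller decides whether a Left result should abort the program.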

Source: the question "scala - SBT project throws java.io.FileNotFoundException: HADOOP_HOME is unset" on Stack Overflow: https://stackoverflow.com/questions/49245348/
