
Java trying to connect to HDFS throws "HADOOP_HOME unset", winutils not found

Reposted. Author: 可可西里. Updated: 2023-11-01 14:51:31

I'm trying to prototype an application that uses Hadoop as a data store, but I've fallen at the first hurdle. I have access to a Hadoop cluster, and I lifted a test sample from Spring to try the first step:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.junit.jupiter.api.Test;

import java.io.PrintWriter;
import java.net.URI;

public class HdfsTest {

    @Test
    public void testHdfs() throws Exception {

        System.setProperty("HADOOP_USER_NAME", "adam");

        // Path that we need to create in HDFS.
        // Just like Unix/Linux file systems, the HDFS file system starts with "/"
        final Path path = new Path("/usr/adam/junk.txt");

        // Uses try-with-resources in order to avoid explicit close calls.
        // Creates an anonymous subclass of DistributedFileSystem to allow calling
        // initialize, as the DFS will not be usable otherwise.
        try (
            final DistributedFileSystem dFS = new DistributedFileSystem() {
                {
                    initialize(new URI("hdfs://hanameservice/user/adam"),
                            new Configuration());
                }
            };
            // Gets an output stream for the path using the DFS instance
            final FSDataOutputStream streamWriter = dFS.create(path);
            // Wraps the output stream in a PrintWriter for higher-level methods
            final PrintWriter writer = new PrintWriter(streamWriter)
        ) {
            // Writes to the file using the print writer
            writer.println("bungalow bill");
            writer.println("what did you kill");
            System.out.println("File Written to HDFS successfully!");
        }
    }
}

These are the Hadoop libraries I'm using:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>2.8.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.8.1</version>
    </dependency>

Could I be missing a dependency?

Here is the logging with the errors - though there seem to be two separate errors.

2017-06-23 16:01:38.787  WARN   --- [           main] org.apache.hadoop.util.Shell             : Did not find winutils.exe: {}

java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset. -see https://wiki.apache.org/hadoop/WindowsProblems
at org.apache.hadoop.util.Shell.fileNotFoundException(Shell.java:528)
at org.apache.hadoop.util.Shell.getHadoopHomeDir(Shell.java:549)
at org.apache.hadoop.util.Shell.getQualifiedBin(Shell.java:572)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:669)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1445)
at org.apache.hadoop.fs.FileSystem.initialize(FileSystem.java:221)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:145)
at com.bp.gis.tardis.HdfsTest$1.<init>(HdfsTest.java:34)
at com.bp.gis.tardis.HdfsTest.testHdfs(HdfsTest.java:31)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:316)
at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:114)
at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.lambda$invokeTestMethod$6(MethodTestDescriptor.java:171)
at org.junit.jupiter.engine.execution.ThrowableCollector.execute(ThrowableCollector.java:40)
at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.invokeTestMethod(MethodTestDescriptor.java:168)
at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.execute(MethodTestDescriptor.java:115)
at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.execute(MethodTestDescriptor.java:57)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:81)
at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:91)
at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:91)
at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:51)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:43)
at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:137)
at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:87)
at org.junit.platform.launcher.Launcher.execute(Launcher.java:93)
at com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:61)
at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
at org.apache.hadoop.util.Shell.checkHadoopHomeInner(Shell.java:448)
at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:419)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:496)
... 35 common frames omitted

2017-06-23 16:01:39.449 WARN --- [ main] org.apache.hadoop.util.NativeCodeLoader : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

java.lang.IllegalArgumentException: java.net.UnknownHostException: hanameservice

at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:130)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:343)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:287)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:156)
at com.bp.gis.tardis.HdfsTest$1.<init>(HdfsTest.java:34)
at com.bp.gis.tardis.HdfsTest.testHdfs(HdfsTest.java:31)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:316)
at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:114)
at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.lambda$invokeTestMethod$6(MethodTestDescriptor.java:171)
at org.junit.jupiter.engine.execution.ThrowableCollector.execute(ThrowableCollector.java:40)
at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.invokeTestMethod(MethodTestDescriptor.java:168)
at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.execute(MethodTestDescriptor.java:115)
at org.junit.jupiter.engine.descriptor.MethodTestDescriptor.execute(MethodTestDescriptor.java:57)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:81)
at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:91)
at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.lambda$execute$1(HierarchicalTestExecutor.java:91)
at org.junit.platform.engine.support.hierarchical.SingleTestExecutor.executeSafely(SingleTestExecutor.java:66)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:76)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:51)
at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:43)
at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:137)
at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:87)
at org.junit.platform.launcher.Launcher.execute(Launcher.java:93)
at com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:61)
at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.net.UnknownHostException: hanameservice
... 36 more

How do I fix this? My contacts for the Hadoop cluster I'm trying to connect to aren't familiar with the hdfs: protocol; their frame of reference seems to be entirely manual rather than programmatic. They want me to log in to an edge node and run scripts there in a shell. I feel I should be asking them some specific questions, but I'm not sure what.

Best Answer

There are two obvious problems:

  1. It looks like you're running from a Windows host. On Windows, Hadoop requires native code extensions so that it can integrate correctly with the operating system for things like file access semantics and permissions. Note that the exception message contains a link to the Apache Hadoop wiki page WindowsProblems. That page has information on how to deal with this.
  2. A socket connection to the host "hanameservice" could not be established. That is most likely not a real hostname but rather the logical name of an HDFS High Availability nameservice. Internally, the HDFS client code maps this logical name to one of the two real NameNode hostnames, but only if the configuration is complete. You probably don't have the full set of configuration files (core-site.xml and hdfs-site.xml) from the cluster. You need the full configuration on your local system for this to work.
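For the first problem, a common workaround (a minimal sketch; the C:\hadoop path is an assumption - point it at whatever directory actually contains bin\winutils.exe on your machine) is to set hadoop.home.dir before any Hadoop class loads:

```java
public class WinutilsSetup {
    public static void main(String[] args) {
        // Hypothetical local path containing bin\winutils.exe; adjust as needed.
        // This must run before the first Hadoop class is touched, because
        // org.apache.hadoop.util.Shell resolves the home directory in a
        // static initializer (visible in the <clinit> frames of the trace above).
        System.setProperty("hadoop.home.dir", "C:\\hadoop");
        System.out.println(System.getProperty("hadoop.home.dir"));
    }
}
```

Setting the HADOOP_HOME environment variable before launching the JVM achieves the same thing.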
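For the second problem, the client-side HA settings usually look something like the following hdfs-site.xml fragment (a sketch: nn1/nn2 and the namenode1/namenode2 hostnames are placeholders - the real values must come from the cluster's own config files, which is why copying core-site.xml and hdfs-site.xml from the cluster is the reliable route):

```xml
<property>
    <name>dfs.nameservices</name>
    <value>hanameservice</value>
</property>
<property>
    <name>dfs.ha.namenodes.hanameservice</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.hanameservice.nn1</name>
    <value>namenode1.example.com:8020</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.hanameservice.nn2</name>
    <value>namenode2.example.com:8020</value>
</property>
<property>
    <name>dfs.client.failover.proxy.provider.hanameservice</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```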

They want me to login to an edge node and run scripts there in a shell.

Overall, this is probably the shortest path, rather than trying to sort out the Windows integration and configuration. If you wrap your code in the Hadoop Tool interface, build it as a jar, and copy that jar to the edge node, you can then run it there as hadoop jar your-app.jar. You'd be running in a known working environment, with no native code extensions to sort out and no worries about whether your configuration is complete and in sync with the cluster's.
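A minimal sketch of that wrapping (the class name and output path are illustrative; Tool, Configured, and ToolRunner are the real org.apache.hadoop.util API):

```java
import java.io.PrintWriter;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical wrapper: when launched via "hadoop jar" on the edge node,
// the injected Configuration already points at the cluster, so no HA or
// winutils setup is needed in the code itself.
public class HdfsWriteTool extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() returns the configuration that ToolRunner injected.
        try (FileSystem fs = FileSystem.get(getConf());
             FSDataOutputStream out = fs.create(new Path("/user/adam/junk.txt"));
             PrintWriter writer = new PrintWriter(out)) {
            writer.println("bungalow bill");
            writer.println("what did you kill");
        }
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses generic options (-D, -conf, ...) before run().
        System.exit(ToolRunner.run(new Configuration(), new HdfsWriteTool(), args));
    }
}
```

Build the jar, copy it to the edge node, and run it there as hadoop jar your-app.jar.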

Regarding "Java trying to connect to HDFS throws HADOOP_HOME unset, winutils not found", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44724839/
