
hadoop - GCS connector in a non-cloud environment


I have installed the GCS connector for Hadoop 3 and added the configuration below to core-site.xml, as described in Install.md. The goal is to migrate data from HDFS on an on-premises cluster to Cloud Storage.
core-site.xml

fs.gs.project.id=<project-id>
fs.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem
fs.AbstractFileSystem.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS
google.cloud.auth.service.account.enable=true
google.cloud.auth.service.account.json.keyfile=<path to key file>
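In core-site.xml itself these are ordinary Hadoop <property> entries; roughly the following, with the project id and key file path shown as placeholders:

<property>
  <name>fs.gs.project.id</name>
  <value>my-project-id</value> <!-- placeholder -->
</property>
<property>
  <name>fs.gs.impl</name>
  <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem</value>
</property>
<property>
  <name>fs.AbstractFileSystem.gs.impl</name>
  <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS</value>
</property>
<property>
  <name>google.cloud.auth.service.account.enable</name>
  <value>true</value>
</property>
<property>
  <name>google.cloud.auth.service.account.json.keyfile</name>
  <value>/path/to/keyfile.json</value> <!-- placeholder -->
</property>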
Restarted the services.
When I try to access a bucket in Cloud Storage to list its files, the command fails.
hdfs --loglevel TRACE dfs -ls gs://data-store/
20/08/17 15:44:09 DEBUG gcs.GoogleHadoopFileSystemBase: GHFS version: hadoop3-2.1.4
20/08/17 15:44:09 DEBUG fs.FileSystem: gs:// = class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem from /usr/hdp/3.0.0.0-1634/hadoop/lib/gcs-connector-hadoop3-latest.jar
20/08/17 15:44:09 DEBUG fs.FileSystem: file:// = class org.apache.hadoop.fs.LocalFileSystem from /usr/hdp/3.0.0.0-1634/hadoop/hadoop-common-3.1.0.3.0.0.0-1634.jar
20/08/17 15:44:09 DEBUG fs.FileSystem: viewfs:// = class org.apache.hadoop.fs.viewfs.ViewFileSystem from /usr/hdp/3.0.0.0-1634/hadoop/hadoop-common-3.1.0.3.0.0.0-1634.jar
20/08/17 15:44:09 DEBUG fs.FileSystem: har:// = class org.apache.hadoop.fs.HarFileSystem from /usr/hdp/3.0.0.0-1634/hadoop/hadoop-common-3.1.0.3.0.0.0-1634.jar
20/08/17 15:44:09 DEBUG fs.FileSystem: http:// = class org.apache.hadoop.fs.http.HttpFileSystem from /usr/hdp/3.0.0.0-1634/hadoop/hadoop-common-3.1.0.3.0.0.0-1634.jar
20/08/17 15:44:09 DEBUG fs.FileSystem: https:// = class org.apache.hadoop.fs.http.HttpsFileSystem from /usr/hdp/3.0.0.0-1634/hadoop/hadoop-common-3.1.0.3.0.0.0-1634.jar
20/08/17 15:44:09 DEBUG fs.FileSystem: hdfs:// = class org.apache.hadoop.hdfs.DistributedFileSystem from /usr/hdp/3.0.0.0-1634/hadoop-hdfs/hadoop-hdfs-client-3.1.0.3.0.0.0-1634.jar
20/08/17 15:44:09 DEBUG fs.FileSystem: webhdfs:// = class org.apache.hadoop.hdfs.web.WebHdfsFileSystem from /usr/hdp/3.0.0.0-1634/hadoop-hdfs/hadoop-hdfs-client-3.1.0.3.0.0.0-1634.jar
20/08/17 15:44:09 DEBUG fs.FileSystem: swebhdfs:// = class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem from /usr/hdp/3.0.0.0-1634/hadoop-hdfs/hadoop-hdfs-client-3.1.0.3.0.0.0-1634.jar
20/08/17 15:44:09 DEBUG fs.FileSystem: s3n:// = class org.apache.hadoop.fs.s3native.NativeS3FileSystem from /usr/hdp/3.0.0.0-1634/hadoop-mapreduce/hadoop-aws-3.1.0.3.0.0.0-1634.jar
20/08/17 15:44:09 DEBUG fs.FileSystem: Looking for FS supporting gs
20/08/17 15:44:09 DEBUG fs.FileSystem: looking for configuration option fs.gs.impl
20/08/17 15:44:09 DEBUG fs.FileSystem: Filesystem gs defined in configuration option
20/08/17 15:44:09 DEBUG fs.FileSystem: FS for gs is class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem
20/08/17 15:44:09 DEBUG gcs.GoogleHadoopFileSystemBase: initialize(path: gs://data-store/, config: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, initSuperclass: true)
20/08/17 15:44:09 DEBUG gcs.GoogleHadoopFileSystemBase: initializeDelegationTokenSupport(config: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, path: gs://data-store/)
20/08/17 15:44:09 TRACE gcs.GoogleHadoopFileSystemBase: Failed to initialize delegation token support
java.lang.IllegalStateException: Delegation Tokens are not configured
at com.google.cloud.hadoop.repackaged.gcs.com.google.common.base.Preconditions.checkState(Preconditions.java:508)
at com.google.cloud.hadoop.fs.gcs.auth.GcsDelegationTokens.init(GcsDelegationTokens.java:65)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initializeDelegationTokenSupport(GoogleHadoopFileSystemBase.java:578)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:555)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:510)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3354)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:249)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:232)
at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:104)
at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
20/08/17 15:44:09 DEBUG gcs.GoogleHadoopFileSystemBase: GHFS_ID=GHFS/hadoop3-2.1.4: configure(config: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml)

I am not sure whether I have missed anything in the configuration. The cluster is kerberized and there is a valid Kerberos ticket (not sure whether that is relevant here).
Is anything missing from the configuration? Any suggestions?

Best answer

The stack trace about Delegation Tokens are not configured is actually a red herring. If you read the GCS connector code here, you will see that the connector always tries to set up delegation token support, but that setup fails unless you specify a binding via fs.gs.delegation.token.binding, and the exception you see in the trace is swallowed.
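For completeness, that trace only goes away if a binding is configured; a minimal sketch of such an entry is below, where the class name is purely a placeholder for whichever delegation token binding implementation you would use — you do not need this to fix the listing problem:

<property>
  <name>fs.gs.delegation.token.binding</name>
  <!-- Placeholder: fully-qualified class name of your delegation token binding implementation -->
  <value>com.example.MyDelegationTokenBinding</value>
</property>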
Now, as to why your command fails, I wonder whether there is a typo in your configuration file:

google.cloud.auth.service.account.enable-true
a - instead of = ? Or is that just a copy-paste error?
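If it really is a typo rather than a copy-paste slip, fixing it and restarting should help. One quick way to check the value the client actually picks up (assuming a standard Hadoop client setup) is:

hdfs getconf -confKey google.cloud.auth.service.account.enable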

Regarding hadoop - GCS connector in a non-cloud environment, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63452600/
