hadoop - FileNotFoundException when accessing a MapFile through DistributedCache

Reposted by: 行者123 Last updated: 2023-12-02 21:46:32

I am using Hadoop CDH 4.7 running in YARN mode. There is a MapFile at hdfs://test1:9100/user/tagdict_builder_output/part-00000.
It consists of two files, index and data.
I add it to the distributed cache with the following code:

Configuration conf = new Configuration();
Path tagDictFilePath = new Path("hdfs://test1:9100/user/tagdict_builder_output/part-00000");
DistributedCache.addCacheFile(tagDictFilePath.toUri(), conf);
Job job = new Job(conf);

and initialize a MapFile.Reader in the Mapper's setup method:

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            Path[] localFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
            if (localFiles != null && localFiles.length > 0 && localFiles[0] != null) {
                String mapFileDir = localFiles[0].toString();
                LOG.info("mapFileDir " + mapFileDir);
                FileSystem fs = FileSystem.get(context.getConfiguration());
                reader = new MapFile.Reader(fs, mapFileDir, context.getConfiguration());
            } else {
                throw new IOException("Could not read lexicon file in DistributedCache");
            }
        }

But it throws a FileNotFoundException:

Error: java.io.FileNotFoundException: File does not exist: /home/mps/cdh/local/usercache/mps/appcache/application_1405497023620_0045/container_1405497023620_0045_01_000012/part-00000/data
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:824)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1704)
        at org.apache.hadoop.io.MapFile$Reader.createDataFileReader(MapFile.java:452)
        at org.apache.hadoop.io.MapFile$Reader.open(MapFile.java:426)
        at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:396)
        at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:405)
        at aps.Cdh4MD5TaglistPreprocessor$Vectorizer.setup(Cdh4MD5TaglistPreprocessor.java:61)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:338)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:160)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:155)

I also tried /user/tagdict_builder_output/part-00000 as the path, and I tried using a symbolic link, but neither worked. How can I solve this? Many thanks.

Best answer

As it says here:

Distributed Cache associates the cache files to the current working directory of the mapper and reducer using symlinks.

So you should try to access your file through a File object:

File f = new File("./part-00000");
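
Before opening the reader, it can help to confirm that the symlinked directory really contains both MapFile parts. The following is only a minimal diagnostic sketch using plain java.io; the class name MapFileDirCheck and the method missingParts are hypothetical, and the only facts taken from the question are that a MapFile is a directory holding data and index.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
final class MapFileDirCheck {
    // A MapFile is a directory holding these two files (as the question notes).
    private static final String[] PARTS = {"data", "index"};

    /** Returns the names of the MapFile parts that are absent under dir. */
    static List<String> missingParts(File dir) {
        List<String> missing = new ArrayList<>();
        for (String part : PARTS) {
            if (!new File(dir, part).isFile()) {
                missing.add(part);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // "./part-00000" is the symlink name suggested in the answer.
        File dir = new File(args.length > 0 ? args[0] : "./part-00000");
        System.out.println(dir.getAbsolutePath() + " missing: " + missingParts(dir));
    }
}
```

If nothing is reported missing, the files are on the task's local disk. Note that the stack trace in the question shows DistributedFileSystem.getFileStatus being asked for a container-local path, so a plausible follow-up (an assumption, not confirmed by the answer) is to hand MapFile.Reader the local file system from FileSystem.getLocal(conf) instead of the default one from FileSystem.get(conf).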

Edit 1

My final suggestion:

DistributedCache.addCacheFile(new URI(tagDictFilePath.toString() + "#cache-file"), conf);
DistributedCache.createSymlink(conf);
...
// in mapper
File f = new File("cache-file");
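
To make the role of the "#cache-file" suffix concrete, here is a small sketch of how the link name can be read back from a cache URI with java.net.URI alone. The class CacheLinkName and its method linkName are hypothetical. The fallback to the last path component when there is no fragment is an assumption, though it matches the part-00000 directory visible in the question's stack trace.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Hadoop.
final class CacheLinkName {
    /** Returns the URI fragment if present, otherwise the last path component. */
    static String linkName(URI cacheUri) {
        String fragment = cacheUri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        // Assumption: without a fragment the link is named after the file itself.
        String path = cacheUri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) throws URISyntaxException {
        String base = "hdfs://test1:9100/user/tagdict_builder_output/part-00000";
        System.out.println(linkName(new URI(base + "#cache-file")));
        System.out.println(linkName(new URI(base)));
    }
}
```

The name returned here is the one the mapper would pass to new File(...), relative to the task's working directory.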

This question, hadoop - FileNotFoundException when accessing a MapFile through DistributedCache, comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/24866439/
