java - Getting the input path in a Hadoop mapper class

Reposted. Author: 可可西里. Updated: 2023-11-01 14:50:48

I have implemented a simple MapReduce project in Hadoop for processing logs. The input path is the directory that contains the logs.

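It works fine, but inside the class that implements the mapper I would like to know which input path (that is, which log file) is being processed at any given moment. The mapper code is: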

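import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class StatsMapper extends MapReduceBase
        implements Mapper<WritableComparable<Text>, Text, Text, Text> {

    public static final Log LOG = LogFactory.getLog(StatsMapper.class);

    public void configure(JobConf conf) {}

    public void map(WritableComparable<Text> key, Text value,
                    OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        // process(...) is the asker's own helper; its body is not shown here
        process(key, value);
    }
}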

Any ideas?

Thanks in advance.

Best answer

Read the section on input formats here

How these input files are split up and read is defined by the InputFormat. An InputFormat is a class that provides the following functionality:

- Selects the files or other objects that should be used for input
- Defines the InputSplits that break a file into tasks
- Provides a factory for RecordReader objects that read the file
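
As a rough illustration of the three responsibilities listed in the quoted passage, here is a minimal plain-Java sketch. `InputFormatSketch`, `Split`, `RecordSource`, `SimpleInputFormat` and `LogInputFormat` are hypothetical stand-ins invented for this example and are not Hadoop classes. The sketch only aims to show why a mapper can find out which file it is reading: each split carries the path of the file it was cut from.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT Hadoop's classes.
public class InputFormatSketch {

    /** A chunk of one input file: the file's path plus a byte range. */
    public static final class Split {
        public final String path;
        public final long start;
        public final long length;

        public Split(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /** Stand-in for a record reader; here it only reports its source file. */
    public interface RecordSource {
        String currentPath();
    }

    /** The three responsibilities named in the quoted passage. */
    public interface SimpleInputFormat {
        List<String> selectFiles(List<String> candidates);
        List<Split> getSplits(String path, long fileLength, long splitSize);
        RecordSource createReader(Split split);
    }

    /** Hypothetical format that accepts only files ending in ".log". */
    public static class LogInputFormat implements SimpleInputFormat {

        // 1. Select the files that should be used as input.
        public List<String> selectFiles(List<String> candidates) {
            List<String> selected = new ArrayList<>();
            for (String candidate : candidates) {
                if (candidate.endsWith(".log")) {
                    selected.add(candidate);
                }
            }
            return selected;
        }

        // 2. Break one file into splits of at most splitSize bytes.
        public List<Split> getSplits(String path, long fileLength, long splitSize) {
            if (splitSize <= 0) {
                throw new IllegalArgumentException("splitSize must be positive");
            }
            List<Split> splits = new ArrayList<>();
            for (long start = 0; start < fileLength; start += splitSize) {
                long length = Math.min(splitSize, fileLength - start);
                splits.add(new Split(path, start, length));
            }
            return splits;
        }

        // 3. Act as a factory for readers; each reader knows its split's path.
        public RecordSource createReader(Split split) {
            return () -> split.path;
        }
    }

    public static void main(String[] args) {
        LogInputFormat format = new LogInputFormat();
        List<String> candidates = new ArrayList<>();
        candidates.add("logs/a.log");
        candidates.add("logs/readme.txt");
        for (String file : format.selectFiles(candidates)) {
            // The file length (250) and split size (100) are made-up values.
            for (Split split : format.getSplits(file, 250, 100)) {
                RecordSource reader = format.createReader(split);
                System.out.println(reader.currentPath()
                        + " [" + split.start + ", +" + split.length + ")");
            }
        }
    }
}
```

In the old `org.apache.hadoop.mapred` API used by the question, the analogous lookup would presumably go through the `Reporter` passed to `map`: `reporter.getInputSplit()` returns the current split, and for file-based input formats it can be cast to `FileSplit`, whose `getPath()` gives the file being read. Alternatively, the `map.input.file` job property (named `mapreduce.map.input.file` in later releases) can be read from the `JobConf` in `configure`. Both suggestions assume a file-based input format.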

Regarding "java - Getting the input path in a Hadoop mapper class", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/5221044/
