gpt4 book ai didi

java - 为什么在我设置 map 压缩属性后 hadoop 就卡在那里了?

转载 作者:可可西里 更新时间:2023-11-01 15:38:42 27 4
gpt4 key购买 nike

下面是运行良好的代码片段:

Configuration conf = new Configuration();

//PROBLEM PART!!!!!
//conf.setBoolean("mapred.compress.map.output", true);
//conf.set("mapred.output.compression.type", "BLOCK");
//conf.setClass("mapred.map.output.compression.codec", GzipCodec.class, CompressionCodec.class);

Job job = new Job(conf, "WordCount");
job.setJarByClass(WordCount.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

job.setMapperClass(WordCountMap.class);
job.setReducerClass(Reduce.class);

job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);

FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));

job.waitForCompletion(true);

但是如果我在上面的代码片段中启用问题部分,控制台的输出将停留在:

13/12/26 18:08:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/12/26 18:08:06 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/12/26 18:08:06 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/12/26 18:08:06 INFO input.FileInputFormat: Total input paths to process : 20
13/12/26 18:08:06 WARN snappy.LoadSnappy: Snappy native library not loaded
13/12/26 18:08:06 INFO mapred.JobClient: Running job: job_local1943436108_0001
13/12/26 18:08:06 INFO mapred.LocalJobRunner: Waiting for map tasks
13/12/26 18:08:06 INFO mapred.LocalJobRunner: Starting task: attempt_local1943436108_0001_m_000000_0
13/12/26 18:08:07 INFO util.ProcessTree: setsid exited with exit code 0
13/12/26 18:08:07 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@731d2572
13/12/26 18:08:07 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/user/jude/input/capacity-scheduler.xml:0+7457
13/12/26 18:08:07 INFO mapred.MapTask: io.sort.mb = 100
13/12/26 18:08:07 INFO mapred.MapTask: data buffer = 79691776/99614720
13/12/26 18:08:07 INFO mapred.MapTask: record buffer = 262144/327680
13/12/26 18:08:07 INFO mapred.MapTask: Starting flush of map output
13/12/26 18:08:07 INFO compress.CodecPool: Got brand-new compressor
13/12/26 18:08:07 INFO mapred.MapTask: Starting flush of map output
13/12/26 18:08:07 INFO mapred.JobClient: map 0% reduce 0%
13/12/26 18:08:12 INFO mapred.LocalJobRunner:
13/12/26 18:08:13 INFO mapred.JobClient: map 5% reduce 0%
//no more

我只是想压缩map的输出,我的代码有什么问题吗?非常感谢!

最佳答案

使用压缩需要 Hadoop 使用适用于您的平台的 native 库,但显然您没有它们(或者没有正确配置库的路径)。这是解释问题的消息:

[...] NativeCodeLoader: Unable to load native-hadoop library for your platform... 

可能的解决方案:

关于java - 为什么在我设置 map 压缩属性后 hadoop 就卡在那里了?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/20783020/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com