
java - Hadoop MapReduce output for maximum

Reposted. Author: 行者123. Updated: 2023-12-02 22:03:05

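I am currently using Eclipse and Hadoop to write a mapper and a reducer that find the maximum total cost in an airline dataset.
The total cost is a decimal value and the airline is text.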

The dataset I am using is available at the following link:
https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/236265/dft-flights-data-2011.csv

When I export the jar file to Hadoop,
I get the following message: ls: "output": No such file or directory.
Can anyone help me correct the code?
My code is below.

Mapper:

package org.myorg;

import java.io.IOException;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTotalCostMapper extends Mapper<LongWritable, Text, Text, DoubleWritable>
{
    private final static DoubleWritable totalcostWritable = new DoubleWritable(0);
    private Text AirCarrier = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException
    {
        String[] line = value.toString().split(",");
        AirCarrier.set(line[8]);
        double totalcost = Double.parseDouble(line[2].trim());
        totalcostWritable.set(totalcost);
        context.write(AirCarrier, totalcostWritable);
    }
}

Reducer:

package org.myorg;

import java.io.IOException;
import java.util.ArrayList;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTotalCostReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable>
{
    ArrayList<Double> totalcostList = new ArrayList<Double>();

    @Override
    public void reduce(Text key, Iterable<DoubleWritable> values, Context context)
            throws IOException, InterruptedException
    {
        double maxValue = 0.0;
        for (DoubleWritable value : values)
        {
            maxValue = Math.max(maxValue, value.get());
        }
        context.write(key, new DoubleWritable(maxValue));
    }
}

Main:

package org.myorg;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTotalCost
{
    public static void main(String[] args) throws Exception
    {
        Configuration conf = new Configuration();
        if (args.length != 2)
        {
            System.err.println("Usage: MaxTotalCost<input path><output path>");
            System.exit(-1);
        }

        Job job;
        job = Job.getInstance(conf, "Max Total Cost");
        job.setJarByClass(MaxTotalCost.class);

        FileInputFormat.addInputPath(job, new Path(args[1]));
        FileOutputFormat.setOutputPath(job, new Path(args[2]));

        job.setMapperClass(MaxTotalCostMapper.class);
        job.setReducerClass(MaxTotalCostReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(DoubleWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Best answer

ls: "output" : No such file or directory

You don't have an HDFS user directory. Your code never reaches the Mapper or Reducer; this error typically arises in the job setup, at
  FileOutputFormat.setOutputPath(job, new Path(args[2]));
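
A side note on the line quoted above, separate from the HDFS directory issue: the driver only proceeds when args.length is exactly 2, and Java arrays are zero-based, so under that check the only valid indices are 0 and 1. The following is a minimal sketch of that argument handling in plain Java (no Hadoop classes needed). The class name JobArgsSketch and the method parse are hypothetical, introduced only for illustration.

```java
// Hypothetical helper, not part of the original post: shows which indices
// are valid once the driver has verified that exactly two arguments exist.
final class JobArgsSketch {

    /** Returns {inputPath, outputPath}, or throws if the argument count is wrong. */
    static String[] parse(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException(
                    "Usage: MaxTotalCost <input path> <output path>");
        }
        // With two arguments, args[0] is the input path and args[1] the output path.
        // args[2] would throw ArrayIndexOutOfBoundsException here.
        return new String[] { args[0], args[1] };
    }
}
```

In the driver, the first element would be passed to FileInputFormat.addInputPath and the second to FileOutputFormat.setOutputPath.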

Run hdfs dfs -ls and see whether it reports any errors. If it does, create a directory under /user that matches your current user.

Otherwise, change the output directory to /tmp/max
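
Both suggestions come down to how HDFS resolves paths: a relative path such as output is looked up under the user's home directory, while an absolute path such as /tmp/max is used as given. Here is a rough sketch of that rule in plain Java, assuming the default home-directory prefix /user. The class HdfsPathSketch, the method resolve, and the user name alice are hypothetical and only illustrate the idea; this is not Hadoop's own implementation.

```java
// Hypothetical illustration of HDFS path resolution, not Hadoop code.
final class HdfsPathSketch {

    // Assumption: the cluster uses the default home-directory prefix "/user".
    static final String HOME_PREFIX = "/user";

    /** Returns the absolute path that `path` would refer to for the given user. */
    static String resolve(String user, String path) {
        if (path.startsWith("/")) {
            return path; // absolute paths are used as-is
        }
        String home = HOME_PREFIX + "/" + user;
        // An empty path stands for the home directory itself,
        // which is what a bare "hdfs dfs -ls" lists.
        return path.isEmpty() ? home : home + "/" + path;
    }
}
```

For a user named alice, resolve("alice", "output") gives /user/alice/output, which can only exist once /user/alice has been created, whereas resolve("alice", "/tmp/max") stays /tmp/max and does not depend on the home directory.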

Regarding java - Hadoop MapReduce output for maximum, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48096538/
