java - Sending multiple arguments to a reducer (MapReduce)

Reposted. Author: 可可西里. Updated: 2023-11-01 14:50:29

I have written code that performs an operation similar to SQL's GROUP BY.

Here is a sample record from the dataset I am using:


250788681419,20090906,200937,200909,619,SUNDAY,WEEKEND,ON-NET,MORNING,OUTGOING,VOICE,25078,PAY_AS_YOU_GO_PER_SECOND_PSB,SUCCESSFUL-RELEASEDBYSERVICE,17,0,1,21.25-10-1452-1452-17


import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class MyMap extends Mapper<LongWritable, Text, Text, DoubleWritable> {

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String[] attribute = line.split(",");
        double rs = Double.parseDouble(attribute[17]);

        // Composite group-by key built from columns 5, 8 and 10
        String comb = attribute[5].concat(attribute[8].concat(attribute[10]));

        context.write(new Text(comb), new DoubleWritable(rs));
    }
}

public class MyReduce extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {

    // values must be an Iterable (not an Iterator) for this method
    // to override Reducer.reduce
    @Override
    protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
            throws IOException, InterruptedException {
        double sum = 0;
        for (DoubleWritable val : values) {
            sum += val.get();
        }
        context.write(key, new DoubleWritable(sum));
    }
}

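In the Mapper, the field at index 17 is emitted as the value and sent to the reducer to be summed. Now I also want to sum the field at index 14. How can I send it to the reducer as well?

Best answer

If your values are all of the same type, creating an ArrayWritable subclass should solve this. The class should look something like: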

public class DblArrayWritable extends ArrayWritable
{
    public DblArrayWritable()
    {
        super(DoubleWritable.class);
    }
}

Your mapper class would then look like:

public class MyMap extends Mapper<LongWritable, Text, Text, DblArrayWritable>
{
    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException
    {
        String line = value.toString();
        String[] attribute = line.split(",");

        // Wrap each parsed double in a DoubleWritable
        DoubleWritable[] values = new DoubleWritable[2];
        values[0] = new DoubleWritable(Double.parseDouble(attribute[14]));
        values[1] = new DoubleWritable(Double.parseDouble(attribute[17]));

        String comb = attribute[5].concat(attribute[8].concat(attribute[10]));

        // set() returns void, so fill the array before passing it to write()
        DblArrayWritable out = new DblArrayWritable();
        out.set(values);
        context.write(new Text(comb), out);
    }
}

In your reducer, you should now be able to iterate over the values of each DblArrayWritable.
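The summing step on the reducer side can be sketched in plain Java, so that it runs without Hadoop on the classpath. This is only an illustration under assumptions: the class `ColumnSums` and the method `sumColumns` are hypothetical names, the sample numbers are made up, and each `double[]` stands in for one DblArrayWritable whose elements are in the order the mapper wrote them (index 14 first, index 17 second). In an actual reducer the outer loop would run over `Iterable<DblArrayWritable>`, and each element returned by `get()` would be cast to DoubleWritable.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: stands in for the body of a reduce() method.
final class ColumnSums {

    // Adds up each position across all value arrays received for one key.
    static double[] sumColumns(Iterable<double[]> values, int width) {
        double[] sums = new double[width];
        for (double[] row : values) {
            for (int i = 0; i < width; i++) {
                sums[i] += row[i];
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        // Made-up values for a single key.
        List<double[]> values = Arrays.asList(
                new double[] {17.0, 21.25},
                new double[] {3.0, 0.75});
        double[] sums = sumColumns(values, 2);
        System.out.println(sums[0] + " " + sums[1]);
    }
}
```

The two sums could then be written out either as another DblArrayWritable or as a single Text line, depending on the output format the job needs.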

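Judging from your sample data, however, the values may be of different types. You might be able to implement an ObjectArrayWritable class to handle that, but I am not sure about this and I cannot find much to support it. If it works, the class would be: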

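public class ObjArrayWritable extends ArrayWritable
{
    public ObjArrayWritable()
    {
        // Caveat: ArrayWritable's constructor takes a Class<? extends Writable>,
        // so passing Object.class is expected to be rejected by the compiler.
        super(Object.class);
    }
}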

You can handle this by simply concatenating the values and passing them to the reducer as Text; the reducer then splits them apart again.
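A minimal sketch of that join-and-split idea, again in plain Java so it runs on its own. The class `JoinedValue`, its methods, and the tab delimiter are assumptions made for illustration. In a real job the joined string would be wrapped in a Text on the mapper side and recovered with `toString()` in the reducer.

```java
// Hypothetical helper showing "join in the mapper, split in the reducer".
final class JoinedValue {

    // Assumed delimiter; it must never occur inside either field.
    static final String SEP = "\t";

    // Mapper side: pack both fields into one string.
    static String join(String first, String second) {
        return first + SEP + second;
    }

    // Reducer side: recover the fields; limit -1 keeps trailing empty fields.
    static String[] split(String joined) {
        return joined.split(SEP, -1);
    }

    public static void main(String[] args) {
        // Made-up field values.
        String packed = join("17", "21.25");
        String[] parts = split(packed);
        double first = Double.parseDouble(parts[0]);
        double second = Double.parseDouble(parts[1]);
        System.out.println(first + " " + second);
    }
}
```

The trade-off is that every value is parsed from text again in the reducer, and the delimiter has to be chosen so that it cannot appear in the data.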

Another option is to implement your own Writable class. Here is an example of how that might work:

public static class PairWritable implements Writable
{
    private Double myDouble;
    private String myString;

    // Hadoop serialization: Writable interface methods.
    // Fields must be read in the same order they are written.
    @Override
    public void readFields(DataInput in) throws IOException {
        myDouble = in.readDouble();
        myString = in.readUTF();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeDouble(myDouble);
        out.writeUTF(myString);
    }

    // End of Writable implementation

    // Getter and setter methods for the myDouble and myString fields
    public void set(Double d, String s) {
        myDouble = d;
        myString = s;
    }

    public Double getDouble() {
        return myDouble;
    }

    public String getString() {
        return myString;
    }
}
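To check that a pair of `write` and `readFields` methods agree with each other, a round trip through a byte buffer is a quick test. The sketch below uses only `java.io`, so it runs without Hadoop. `PairRecord` and `roundTrip` are hypothetical names, and the class only mirrors the two method signatures of Writable instead of implementing the interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for a custom Writable; with Hadoop on the
// classpath it would declare "implements Writable".
final class PairRecord {
    double number;
    String label;

    // Fields are written in a fixed order...
    void write(DataOutput out) throws IOException {
        out.writeDouble(number);
        out.writeUTF(label);
    }

    // ...and must be read back in exactly the same order.
    void readFields(DataInput in) throws IOException {
        number = in.readDouble();
        label = in.readUTF();
    }

    // Serializes a record to bytes and reads it into a fresh instance.
    static PairRecord roundTrip(PairRecord original) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));
        PairRecord copy = new PairRecord();
        copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        // Made-up field values.
        PairRecord original = new PairRecord();
        original.number = 21.25;
        original.label = "VOICE";
        PairRecord copy = roundTrip(original);
        System.out.println(copy.number + " " + copy.label);
    }
}
```

If the read order and write order ever drift apart, this kind of round trip fails immediately, which is usually easier to diagnose than a corrupted value inside a running job.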

Regarding java - Sending multiple arguments to a reducer (MapReduce), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/14516029/
