
hadoop - How do I debug Hive UDFs, UDAFs, and UDTFs written in Java in an IDE such as Eclipse?

Reposted. Author: 可可西里. Updated: 2023-11-01 14:48:52

For example, this is possible for debugging Pig UDFs: http://ben-tech.blogspot.ie/2011/08/how-to-debug-pig-udfs-in-eclipse.html
I have a Hive script that uses a UDAF which is failing, so I would like to step through the UDF code.

Best answer

A UDF is just a Java class, so you can write a JUnit test for it and debug that test from the Eclipse IDE like any other test.

Consider this UDF.

Example 1

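import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// A simple (non-generic) UDF; Hive resolves evaluate() by reflection.
public class SimpleHelloWorldUDFExample extends UDF {
    public Text evaluate(Text input) {
        if (input == null) return null;
        return new Text("Hello " + input.toString());
    }
}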

A JUnit test method for it looks like this:

@Test
public void testUDFNullCheck() {
    SimpleHelloWorldUDFExample example = new SimpleHelloWorldUDFExample();
    Assert.assertNull(example.evaluate(null));
}

Example 2

package com.hive.udftest;

import java.util.List;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;

class HiveUDFTest extends GenericUDF {

    ListObjectInspector listOI;
    StringObjectInspector elementOI;

    @Override
    public String getDisplayString(String[] arg0) {
        return "arrayContainsExample()"; // this should probably be better
    }

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
        if (arguments.length != 2) {
            throw new UDFArgumentLengthException("arrayContainsExample only takes 2 arguments: List<T>, T");
        }
        // 1. Check we received the right object types.
        ObjectInspector a = arguments[0];
        ObjectInspector b = arguments[1];
        if (!(a instanceof ListObjectInspector) || !(b instanceof StringObjectInspector)) {
            throw new UDFArgumentException("first argument must be a list / array, second argument must be a string");
        }
        this.listOI = (ListObjectInspector) a;
        this.elementOI = (StringObjectInspector) b;

        // 2. Check that the list contains strings
        if (!(listOI.getListElementObjectInspector() instanceof StringObjectInspector)) {
            throw new UDFArgumentException("first argument must be a list of strings");
        }

        // the return type of our function is a boolean, so we provide the correct object inspector
        return PrimitiveObjectInspectorFactory.javaBooleanObjectInspector;
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {

        // get the list and string from the deferred objects using the object inspectors
        List<String> list = (List<String>) this.listOI.getList(arguments[0].get());
        String arg = elementOI.getPrimitiveJavaObject(arguments[1].get());

        // check for nulls
        if (list == null || arg == null) {
            return null;
        }

        // see if our list contains the value we need
        for (String s : list) {
            if (arg.equals(s)) return Boolean.TRUE;
        }
        return Boolean.FALSE;
    }

}

The JUnit test case is:

package com.hive.udftest;

import java.util.ArrayList;
import java.util.List;

import org.junit.Assert;

import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredJavaObject;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaBooleanObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.junit.Test;

public class HiveUDFTestTest {

    @Test
    public void testComplexUDFReturnsCorrectValues() throws HiveException {

        // set up the models we need
        HiveUDFTest example = new HiveUDFTest();
        ObjectInspector stringOI = PrimitiveObjectInspectorFactory.javaStringObjectInspector;
        ObjectInspector listOI = ObjectInspectorFactory.getStandardListObjectInspector(stringOI);
        JavaBooleanObjectInspector resultInspector = (JavaBooleanObjectInspector) example.initialize(new ObjectInspector[]{listOI, stringOI});

        // create the actual UDF arguments
        List<String> list = new ArrayList<String>();
        list.add("a");
        list.add("b");
        list.add("c");

        // test our results

        // the value exists
        Object result = example.evaluate(new DeferredObject[]{new DeferredJavaObject(list), new DeferredJavaObject("a")});
        Assert.assertEquals(true, resultInspector.get(result));

        // the value doesn't exist
        Object result2 = example.evaluate(new DeferredObject[]{new DeferredJavaObject(list), new DeferredJavaObject("d")});
        Assert.assertEquals(false, resultInspector.get(result2));

        // arguments are null
        Object result3 = example.evaluate(new DeferredObject[]{new DeferredJavaObject(null), new DeferredJavaObject(null)});
        Assert.assertNull(result3);
    }
}

UDAFs and UDTFs can be tested and debugged in a similar way.
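
For a UDAF, the part worth stepping through is usually the evaluator lifecycle: Hive resets the evaluator, feeds it rows, passes partial results between tasks, and then asks for the final value. Below is a rough sketch of driving that lifecycle by hand so that breakpoints can be set in each step. To stay self-contained it does not use Hive's classes. MaxLengthEvaluator and UdafLifecycleDemo are hypothetical names, and the class is a plain Java stand-in that only borrows the method names used by old-style UDAF evaluators (init, iterate, terminatePartial, merge, terminate). In a real project, the test would instantiate your actual evaluator class instead.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a UDAF evaluator that tracks the longest string
// length. Plain Java, NOT a Hive class, so it compiles without Hive.
class MaxLengthEvaluator {
    private Integer max; // null until a non-null value has been seen

    // Reset state before a new aggregation starts.
    public void init() {
        max = null;
    }

    // Called once per input row.
    public boolean iterate(String value) {
        if (value != null) {
            merge(value.length());
        }
        return true;
    }

    // Partial result handed from one task to another.
    public Integer terminatePartial() {
        return max;
    }

    // Fold another task's partial result into this evaluator.
    public boolean merge(Integer other) {
        if (other != null && (max == null || other > max)) {
            max = other;
        }
        return true;
    }

    // Final result; null when no non-null input was seen.
    public Integer terminate() {
        return max;
    }
}

class UdafLifecycleDemo {
    // Simulates two map-side evaluators feeding one reduce-side evaluator.
    static Integer aggregate(List<String> split1, List<String> split2) {
        MaxLengthEvaluator mapper1 = new MaxLengthEvaluator();
        mapper1.init();
        for (String s : split1) mapper1.iterate(s);

        MaxLengthEvaluator mapper2 = new MaxLengthEvaluator();
        mapper2.init();
        for (String s : split2) mapper2.iterate(s);

        MaxLengthEvaluator reducer = new MaxLengthEvaluator();
        reducer.init();
        reducer.merge(mapper1.terminatePartial());
        reducer.merge(mapper2.terminatePartial());
        return reducer.terminate();
    }

    public static void main(String[] args) {
        System.out.println(aggregate(Arrays.asList("a", "hive"), Arrays.asList("udaf!", null)));
    }
}
```

Splitting the input across two evaluators matters because bugs in merge or terminatePartial only show up when partial results are combined, which a single-evaluator test would never exercise.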

Regarding "hadoop - How do I debug Hive UDFs, UDAFs, and UDTFs written in Java in an IDE such as Eclipse?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37112444/
