
java - Hive generic UDTF fails with an ArrayIndexOutOfBoundsException

Reposted. Author: 行者123. Updated: 2023-12-01 17:21:43

This question is about Hive generic UDTFs.

The goal of the program is to take a single string column as input and, after splitting the input string on spaces, emit one output row per token. A jar file was built and added in the Hive shell, and a temporary function was created for the class name. Calling the function raises an ArrayIndexOutOfBoundsException.
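Concretely, the intended behavior is that one input value such as "john paul smith" becomes three output rows, one per token. A minimal sketch of just the splitting step, in plain Java outside Hive (the sample input string is an assumption for illustration):

```java
public class SplitSketch {
    public static void main(String[] args) {
        // Hypothetical input value of the string column.
        String input = "john paul smith";

        // Split on single spaces, exactly as the UDTF does.
        String[] tokens = input.split(" ");

        // Each token should become its own output row.
        for (String t : tokens) {
            System.out.println(t);
        }
    }
}
```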

Code:

package com.suba.customHiveUdfs;

import java.util.ArrayList;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

public class MyUdtf extends GenericUDTF {

    ArrayList<String> colList = new ArrayList<>(1);
    ArrayList<ObjectInspector> oiList = new ArrayList<>(1);
    PrimitiveObjectInspector poi = null;

    @Override
    public StructObjectInspector initialize(ObjectInspector[] argOIs) throws UDFArgumentException {
        if (argOIs.length > 1) {
            throw new UDFArgumentException("invalid argument");
        }
        if (argOIs[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
            throw new UDFArgumentException("primitive expected");
        }
        if (((PrimitiveObjectInspector) argOIs[0])
                .getPrimitiveCategory() != PrimitiveObjectInspector.PrimitiveCategory.STRING) {
            throw new UDFArgumentException("not string type");
        }
        poi = (PrimitiveObjectInspector) argOIs[0];
        colList.add("name");
        oiList.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(colList, oiList);
    }

    @Override
    public void process(Object[] arg0) throws HiveException {
        String name = poi.getPrimitiveJavaObject(arg0[0]).toString();
        String[] tokens = name.split(" ");
        for (String x : tokens) {
            Object[] objects = new Object[] { x };
            forward(objects);
        }
    }

    @Override
    public void close() throws HiveException {
    }
}

The error message is shown below; an ArrayIndexOutOfBoundsException is raised.

Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
at java.util.Arrays$ArrayList.get(Arrays.java:3841)
at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:417)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:592)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:125)
at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:45)
at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:107)
at com.suba.customHiveUdfs.MyUdtf.process(MyUdtf.java:61)
at org.apache.hadoop.hive.ql.exec.UDTFOperator.processOp(UDTFOperator.java:108)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:539)
... 9 more
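A plausible reading of this trace (an assumption, since the trace points into LazySimpleSerDe, not user code): the pre-fix process method forwarded the whole tokens array in one call, so Hive treated each token as a separate output field, while initialize() declared only one output column ("name"). When the serializer looks up the column for the second field, the one-element column list is indexed at position 1 and throws exactly the AIOOBE seen above. A small sketch mimicking that index mismatch:

```java
import java.util.Arrays;
import java.util.List;

public class ColumnMismatchDemo {
    public static void main(String[] args) {
        // The UDTF declared exactly one output column in initialize().
        List<String> declaredColumns = Arrays.asList("name");

        // Forwarding the whole tokens array hands the serializer three fields.
        String[] tokens = "a b c".split(" ");

        for (int i = 0; i < tokens.length; i++) {
            try {
                // The serializer pairs each field with a declared column;
                // there is no column at index 1, mirroring the AIOOBE.
                System.out.println(declaredColumns.get(i) + " = " + tokens[i]);
            } catch (IndexOutOfBoundsException e) {
                System.out.println("out of bounds at field " + i);
            }
        }
    }
}
```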

Best answer

The problem was solved once the process method was changed to loop over the tokens and forward a single-element array per token:

    for (String x : tokens) {
        String[] string = new String[] { x };
        forward(string);
    }
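To see why this shape works, here is a standalone sketch of the per-token emission logic, with a hypothetical collector standing in for GenericUDTF.forward (the collector and sample input are assumptions for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class PerTokenForwardDemo {
    // Stand-in for the rows GenericUDTF.forward would emit downstream.
    static List<Object[]> forwarded = new ArrayList<>();

    static void forward(Object[] row) {
        forwarded.add(row);
    }

    public static void main(String[] args) {
        String name = "john paul george";

        // One forward() call per token: each emitted row carries exactly one
        // field, matching the single column declared in initialize().
        for (String x : name.split(" ")) {
            forward(new String[] { x });
        }

        System.out.println(forwarded.size() + " rows"); // prints "3 rows"
    }
}
```

The key point is one row per forward() call, with the row array's length equal to the number of declared output columns.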

A similar question about this java / Hive generic UDTF ArrayIndexOutOfBoundsException was found on Stack Overflow: https://stackoverflow.com/questions/61280608/
