
java - Avro schema for GenericRecord: Be able to leave blank fields

Reposted. Author: 行者123. Updated: 2023-12-02 08:54:13

I'm using Java to convert JSON to Avro and storing the results in GCS with Google DataFlow. The Avro schema is created at runtime using SchemaBuilder.

One of the fields I define in the schema is an optional LONG field, defined as follows:

SchemaBuilder.FieldAssembler<Schema> fields = SchemaBuilder.record(mainName).fields();
Schema concreteType = SchemaBuilder.nullable().longType();
fields.name("key1").type(concreteType).noDefault();

Now, when I create a GenericRecord with the schema above without setting "key1" and pass the resulting GenericRecord to my DoFn's context with context.output(res); I get the following error:

Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: org.apache.avro.UnresolvedUnionException: Not in union ["long","null"]: 256

I also tried the same thing with withDefault(0L) and got the same result.

What am I missing? Thanks.

Best answer

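It works fine for me when I define the field as shown below. Printing the schema may help you compare it with yours, and you can also try removing nullable() from the long type.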

fields.name("key1").type().nullable().longType().longDefault(0);

Here is the complete code I used for testing:

import org.apache.avro.AvroRuntimeException;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.SchemaBuilder.FieldAssembler;
import org.apache.avro.SchemaBuilder.RecordBuilder;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData.Record;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;

import java.io.File;
import java.io.IOException;

public class GenericRecordExample {

    public static void main(String[] args) {

        // build the schema at runtime: two nullable strings and a nullable long with default 0
        FieldAssembler<Schema> fields;
        RecordBuilder<Schema> record = SchemaBuilder.record("Customer");
        fields = record.namespace("com.example").fields();
        fields = fields.name("first_name").type().nullable().stringType().noDefault();
        fields = fields.name("last_name").type().nullable().stringType().noDefault();
        fields = fields.name("account_number").type().nullable().longType().longDefault(0);

        Schema schema = fields.endRecord();
        System.out.println(schema.toString());

        // we build our first customer
        GenericRecordBuilder customerBuilder = new GenericRecordBuilder(schema);
        customerBuilder.set("first_name", "John");
        customerBuilder.set("last_name", "Doe");
        customerBuilder.set("account_number", 999333444111L); // note the L suffix: the value must be a Long
        Record myCustomer = customerBuilder.build();
        System.out.println(myCustomer);

        // writing to a file
        final DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<>(schema);
        try (DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<>(datumWriter)) {
            dataFileWriter.create(myCustomer.getSchema(), new File("customer-generic.avro"));
            dataFileWriter.append(myCustomer);
            System.out.println("Written customer-generic.avro");
        } catch (IOException e) {
            System.out.println("Couldn't write file");
            e.printStackTrace();
        }

        // reading from a file (the reader picks up the schema stored in the file)
        final File file = new File("customer-generic.avro");
        final DatumReader<GenericRecord> datumReader = new GenericDatumReader<>();
        GenericRecord customerRead;
        try (DataFileReader<GenericRecord> dataFileReader = new DataFileReader<>(file, datumReader)) {
            customerRead = dataFileReader.next();
            System.out.println("Successfully read avro file");
            System.out.println(customerRead.toString());

            // get the data from the generic record
            System.out.println("First name: " + customerRead.get("first_name"));

            // read a non-existent field; depending on the Avro version this
            // may return null or throw an AvroRuntimeException
            System.out.println("Non existent field: " + customerRead.get("not_here"));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
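
A side note on the error in the question: the message names the value 256 as not belonging to the union ["long","null"]. One possible cause, which is an assumption and not something the question confirms, is that a number parsed from JSON arrived as an Integer rather than a Long; as far as I know, Avro's generic union resolution matches on the Java type of the value, so an Integer would not match the "long" branch. The sketch below uses only the JDK and a hypothetical helper, toLongOrNull, to show how such a value could be normalized before it is put into a record.

```java
public final class LongCoercion {

    private LongCoercion() {
    }

    /**
     * Hypothetical helper (not part of Avro): returns null for null,
     * a Long for any integral boxed number, and rejects everything else.
     */
    public static Long toLongOrNull(Object value) {
        if (value == null) {
            return null; // stays null, which fits the "null" branch of the union
        }
        if (value instanceof Long) {
            return (Long) value;
        }
        if (value instanceof Integer || value instanceof Short || value instanceof Byte) {
            return ((Number) value).longValue(); // widen to long without loss
        }
        throw new IllegalArgumentException(
                "Cannot convert " + value.getClass().getName() + " to Long");
    }

    public static void main(String[] args) {
        Object parsed = 256; // autoboxed to Integer, as a JSON parser might produce
        Object coerced = toLongOrNull(parsed);
        // prints "Integer -> Long"
        System.out.println(parsed.getClass().getSimpleName()
                + " -> " + coerced.getClass().getSimpleName());
    }
}
```

With a helper like this, a call such as record.put("key1", LongCoercion.toLongOrNull(value)) would always pass either a Long or null. Whether that resolves the original error depends on where the 256 actually came from.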

Regarding java - Avro schema for GenericRecord: Be able to leave blank fields, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/60591364/
