arrays - Optional array in an Avro schema

Reposted. Author: 行者123. Updated: 2023-12-04 01:40:19

I would like to know whether it is possible to have an optional array.
Let's assume a schema like this:

{
  "type": "record",
  "name": "test_avro",
  "fields": [
    {"name": "test_field_1", "type": "long"},
    {"name": "subrecord", "type": [{
        "type": "record",
        "name": "subrecord_type",
        "fields": [{"name": "field_1", "type": "long"}]
      }, "null"]
    },
    {"name": "simple_array",
     "type": {
       "type": "array",
       "items": "string"
     }
    }
  ]
}

Trying to write an Avro record without "simple_array" results in an NPE in the data file writer.
For the subrecord this works fine, but when I try to define the array as optional:
{"name": "simple_array",
 "type": [{
   "type": "array",
   "items": "string"
 }, "null"]
}

it does not result in an NPE but in a runtime exception:
AvroRuntimeException: Not an array schema: [{"type":"array","items":"string"},"null"]
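
The message shows the whole union being handed to something that expects an array schema. The code that builds the record is not shown here, so this is only a guess at the cause. The sketch below uses plain Python and the standard json module, with a hypothetical helper array_branch (not part of any Avro library), to illustrate the difference between the union and the array branch inside it:

```python
import json

def array_branch(field_type):
    """Return the array schema inside `field_type`, or None if there is none."""
    # A union is a JSON list of branches; a plain type is treated as one branch.
    branches = field_type if isinstance(field_type, list) else [field_type]
    for branch in branches:
        if isinstance(branch, dict) and branch.get("type") == "array":
            return branch
    return None

optional_array = json.loads('[{"type": "array", "items": "string"}, "null"]')

print(isinstance(optional_array, list))       # True: the top level is a union, not an array
print(array_branch(optional_array)["items"])  # string
```

The point of the sketch is only that anything needing the element type has to pick the array branch out of the union first.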

Thanks.

Best answer

I think what you want is a union of null and array:

{
  "type": "record",
  "name": "test_avro",
  "fields": [
    {
      "name": "test_field_1",
      "type": "long"
    },
    {
      "name": "subrecord",
      "type": [
        {
          "type": "record",
          "name": "subrecord_type",
          "fields": [
            {
              "name": "field_1",
              "type": "long"
            }
          ]
        },
        "null"
      ]
    },
    {
      "name": "simple_array",
      "type": [
        "null",
        {
          "type": "array",
          "items": "string"
        }
      ],
      "default": null
    }
  ]
}
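
Note that "null" is listed first in the simple_array union and the field has a null default. As far as I know, the Avro specification has traditionally required the default of a union field to match the first branch of the union, which is why "null" usually comes first. Here is a minimal sketch of that check using only Python's json module; nullable_default_ok is a hypothetical helper, not part of the avro package:

```python
import json

def nullable_default_ok(field):
    """True if `field` is a union starting with "null" and has a null default."""
    ftype = field.get("type")
    return (
        isinstance(ftype, list)
        and len(ftype) > 0
        and ftype[0] == "null"
        and "default" in field
        and field["default"] is None
    )

simple_array = json.loads('''
{"name": "simple_array",
 "type": ["null", {"type": "array", "items": "string"}],
 "default": null}
''')
subrecord = json.loads('''
{"name": "subrecord",
 "type": [{"type": "record", "name": "subrecord_type",
           "fields": [{"name": "field_1", "type": "long"}]}, "null"]}
''')

print(nullable_default_ok(simple_array))  # True
print(nullable_default_ok(subrecord))     # False: "null" comes second and there is no default
```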

When I use the above schema with sample data in Python, the result is as follows (schema_string is the JSON string above):
>>> from avro import io, datafile, schema
>>> from json import dumps
>>>
>>> sample_data = {'test_field_1': 12}
>>> rec_schema = schema.parse(schema_string)
>>> rec_writer = io.DatumWriter(rec_schema)
>>> rec_reader = io.DatumReader()
>>>
>>> # write avro file
... df_writer = datafile.DataFileWriter(open("/tmp/foo", 'wb'), rec_writer, writers_schema=rec_schema)
>>> df_writer.append(sample_data)
>>> df_writer.close()
>>>
>>> # read avro file
... df_reader = datafile.DataFileReader(open('/tmp/foo', 'rb'), rec_reader)
>>> print(dumps(next(df_reader)))
{"test_field_1": 12, "subrecord": null, "simple_array": null}
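
The record was written without subrecord or simple_array keys, and both came back as null. My understanding is that the Python avro package looks up record fields with dict.get, so a missing key behaves like None and matches the "null" branch of a union. The sketch below is a simplified stand-in for that idea covering only the types used here; matches and record are hypothetical helpers, not the library's actual validation code:

```python
def matches(schema, datum):
    """Simplified check of a datum against a tiny subset of Avro types."""
    if isinstance(schema, list):  # union: any branch may match
        return any(matches(branch, datum) for branch in schema)
    if isinstance(schema, str):   # primitive type name
        if schema == "null":
            return datum is None
        if schema == "long":
            return isinstance(datum, int) and not isinstance(datum, bool)
        if schema == "string":
            return isinstance(datum, str)
        return False
    if schema["type"] == "array":
        return isinstance(datum, list) and all(
            matches(schema["items"], item) for item in datum
        )
    if schema["type"] == "record":
        # A missing key is read with .get(), so it is treated as None.
        return isinstance(datum, dict) and all(
            matches(f["type"], datum.get(f["name"])) for f in schema["fields"]
        )
    return False

ARRAY = {"type": "array", "items": "string"}
SUBRECORD = {"type": "record", "name": "subrecord_type",
             "fields": [{"name": "field_1", "type": "long"}]}

def record(array_type):
    """Build the test_avro schema with the given type for simple_array."""
    return {"type": "record", "name": "test_avro", "fields": [
        {"name": "test_field_1", "type": "long"},
        {"name": "subrecord", "type": [SUBRECORD, "null"]},
        {"name": "simple_array", "type": array_type},
    ]}

sample_data = {"test_field_1": 12}
with_array = {"test_field_1": 12, "simple_array": ["a", "b"]}

print(matches(record(ARRAY), sample_data))            # False: a plain array cannot be None
print(matches(record(["null", ARRAY]), sample_data))  # True: None matches the "null" branch
print(matches(record(["null", ARRAY]), with_array))   # True: the list matches the array branch
```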

A similar question about arrays - optional array in an Avro schema can be found on Stack Overflow: https://stackoverflow.com/questions/9417732/
