
java - Druid spatial dimension data load error during Hadoop ingestion

Reposted. Author: 行者123. Updated: 2023-12-02 21:58:35

  • I have a Hadoop ingestion process for the data (as described at https://druid.apache.org/docs/latest/ingestion/hadoop.html).
  • The current Druid indexer version is 0.14.2-incubating.
  • The data is TSV files on GCS.

  • An older version of the Druid indexer previously worked without any problems; the error appeared after upgrading to the new version.

    Some details

    Here is the parser section of my spec:
    "parser": {
      "parseSpec": {
        "dimensionsSpec": {
          "spatialDimensions": [
            {
              "dimName": "geo",
              "dims": ["latitude", "longitude"]
            }
          ],
          "dimensionExclusions": [],
          "dimensions": [
            "ip_address",
            "radius",
            "confidence"
          ]
        },
        "timestampSpec": {
          "format": "millis",
          "column": "ts"
        },
        "columns": [
          "ts",
          "ip_address",
          "latitude",
          "longitude",
          "radius",
          "confidence"
        ],
        "format": "tsv"
      },
      "type": "lzo"
    }
    },

    This section causes the following error:
    java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.druid.cli.CliHadoopIndexer.run(CliHadoopIndexer.java:116)
    at org.apache.druid.cli.Main.main(Main.java:118)
    Caused by: java.lang.IllegalArgumentException: Instantiation of [simple type, class org.apache.druid.data.input.impl.DelimitedParseSpec] value failed: column[geo] not in columns. (through reference chain: org.apache.druid.data.input.impl.StringInputRowParser["parseSpec"])
    at shade.com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3459)
    at shade.com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:3378)
    at org.apache.druid.segment.indexing.DataSchema.getParser(DataSchema.java:126)
    at org.apache.druid.indexer.HadoopDruidIndexerConfig.verify(HadoopDruidIndexerConfig.java:591)
    at org.apache.druid.indexer.HadoopDruidIndexerJob.<init>(HadoopDruidIndexerJob.java:49)
    at org.apache.druid.cli.CliInternalHadoopIndexer.run(CliInternalHadoopIndexer.java:124)
    at org.apache.druid.cli.Main.main(Main.java:118)
    ... 6 more
    Caused by: shade.com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class org.apache.druid.data.input.impl.DelimitedParseSpec] value failed: column[geo] not in columns. (through reference chain: org.apache.druid.data.input.impl.StringInputRowParser["parseSpec"])
    at shade.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapException(StdValueInstantiator.java:399)
    at shade.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:231)
    at shade.com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:135)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:442)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1099)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:296)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:166)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:136)
    at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:122)
    at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:93)
    at shade.com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:131)
    at shade.com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:518)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:463)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:378)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1099)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:296)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:166)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:136)
    at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:122)
    at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:93)
    at shade.com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:131)
    at shade.com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:42)
    at shade.com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3454)
    ... 12 more
    Caused by: java.lang.IllegalArgumentException: column[geo] not in columns.
    at shade.com.google.common.base.Preconditions.checkArgument(Preconditions.java:148)
    at org.apache.druid.data.input.impl.DelimitedParseSpec.verify(DelimitedParseSpec.java:119)
    at org.apache.druid.data.input.impl.DelimitedParseSpec.<init>(DelimitedParseSpec.java:63)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at shade.com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:125)
    at shade.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:227)
    ... 33 more
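Reading the trace bottom-up, the root cause is the Preconditions check in DelimitedParseSpec.verify: for a delimited (TSV) parse spec, every dimension name, including the spatial dimName, must appear in the declared columns list. A minimal sketch of that check (a simplified illustration, not the actual Druid source; the data values are taken from the spec above):

```python
# Simplified illustration of the check performed by DelimitedParseSpec.verify
# in Druid 0.14.x. Not the actual Druid source code.

def verify(columns, dimensions, spatial_dim_names):
    # The spatial dimName ("geo") is treated like any other dimension
    # name, so it is also checked against the TSV columns list.
    for dim in list(dimensions) + list(spatial_dim_names):
        if dim not in columns:
            raise ValueError(f"column[{dim}] not in columns.")

columns = ["ts", "ip_address", "latitude", "longitude", "radius", "confidence"]
dimensions = ["ip_address", "radius", "confidence"]
spatial = ["geo"]  # dimName of the spatial dimension in the spec

try:
    verify(columns, dimensions, spatial)
except ValueError as e:
    print(e)  # column[geo] not in columns.
```

The component dims ("latitude", "longitude") are present in columns; it is the synthetic dimName "geo" that has no corresponding TSV column and trips the check.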


    As I see it, the spec parser is trying to locate the dimension among the columns, even though it is a spatial dimension!

    This is a rather painful issue affecting production.
    Does anyone have any ideas how to fix this error?

    Best Answer

    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "flattenSpec": {
          "fields": [
            { "type": "path", "name": "Longitude", "expr": "$.location.lon" },
            { "type": "path", "name": "Latitude", "expr": "$.location.lat" }
          ]
        },
        "timestampSpec": {
          "column": "timeStamp",
          "format": "auto"
        },
        "dimensionsSpec": {
          "dimensions": ["blogid", "category", "eventType", "userid"],
          "spatialDimensions": [
            {
              "dimName": "coordinates",
              "dims": ["Latitude", "Longitude"]
            }
          ]
        }
      }
    }
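The answer's spec uses "format": "json", which has no columns list, so the column-membership check in DelimitedParseSpec never applies; the flattenSpec synthesizes the coordinate fields that the spatial dimension references. Since the question's data is TSV, one way to apply this workaround is to convert the rows to JSON records first. A hedged sketch (column order taken from the question's "columns" list; the nested "location" shape is a hypothetical choice made to match flattenSpec paths like "$.location.lat"):

```python
# Sketch: convert the question's TSV rows to JSON records so that a
# "format": "json" parseSpec with a flattenSpec can be used instead.
# Column order comes from the question's spec; the nested "location"
# object is an assumed shape, chosen to match "$.location.lat"/".lon".
import csv
import json

COLUMNS = ["ts", "ip_address", "latitude", "longitude", "radius", "confidence"]

def tsv_row_to_json(row):
    rec = dict(zip(COLUMNS, row))
    # Nest the coordinates so a flattenSpec can pull them back out
    # under whatever field names the spatial dimension expects.
    rec["location"] = {
        "lat": rec.pop("latitude"),
        "lon": rec.pop("longitude"),
    }
    return rec

line = "1564617600000\t10.0.0.1\t40.7128\t-74.0060\t25\t0.9"
record = tsv_row_to_json(next(csv.reader([line], delimiter="\t")))
print(json.dumps(record))
```

Whether upgrading past 0.14.2-incubating relaxes the verification for spatial dimNames is not confirmed here; the format switch above is the workaround the answer actually demonstrates.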

    Regarding "java - Druid spatial dimension data load error during Hadoop ingestion", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57221662/
