
json - Elasticsearch: default template does not detect dates

Reposted. Author: 行者123. Updated: 2023-12-03 01:55:20

I have a default template that looks like this:

PUT /_template/abtemp
{
  "template": "abt*",
  "settings": {
    "index.refresh_interval": "5s",
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "index.codec": "best_compression"
  },
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": false
      },
      "_source": {
        "enabled": true
      },
      "dynamic_templates": [
        {
          "message_field": {
            "match": "message",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "omit_norms": true,
              "fielddata": {
                "format": "disabled"
              }
            }
          }
        },
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "omit_norms": true,
              "fielddata": {
                "format": "disabled"
              },
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      ]
    }
  }
}

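The idea here is to:
  • apply the template to all indices whose names match abt*
  • analyze only the string fields named message. All other string fields are meant to be not_analyzed and to have a corresponding .raw field

Now I try to index some data with: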
    curl -s -XPOST hostName:port/indexName/_bulk --data-binary @myFile.json

Here is the file:
    { "index" : { "_index" : "abtclm3","_type" : "test"} }
    { "FIELD1":1, "FIELD2":"2015-11-18 15:32:18", "FIELD3":"MATTHEWS", "FIELD4":"GARY", "FIELD5":"", "FIELD6":"STARMX", "FIELD7":"AL", "FIELD8":"05/15/2010 11:30", "FIELD9":"05/19/2010 7:00", "FIELD10":"05/19/2010 23:00", "FIELD11":3275, "FIELD12":"LC", "FIELD13":"WIN", "FIELD14":"05/15/2010 11:30", "FIELD15":"LC", "FIELD16":"POTUS", "FIELD17":"WH", "FIELD18":"S GROUNDS", "FIELD19":"OFFICE", "FIELD20":"VISITORS", "FIELD21":"STATE ARRIVAL - MEXICO**", "FIELD22":"08/27/2010 07:00:00 AM +0000", "FIELD23":"MATTHEWS", "FIELD24":"GARY", "FIELD25":"", "FIELD26":"STARMX", "FIELD27":"AL", "FIELD28":"05/15/2010 11:30", "FIELD29":"05/19/2010 7:00", "FIELD30":"05/19/2010 23:00", "FIELD31":3275, "FIELD32":"LC", "FIELD33":"WIN", "FIELD34":"05/15/2010 11:30", "FIELD35":"LC", "FIELD36":"POTUS", "FIELD37":"WH", "FIELD38":"S GROUNDS", "FIELD39":"OFFICE", "FIELD40":"VISITORS", "FIELD41":"STATE ARRIVAL - MEXICO**", "FIELD42":"08/27/2010 07:00:00 AM +0000" }
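
For reference, a file in this shape (one action line followed by one source line per document) could also be generated rather than written by hand. The following is a minimal Python sketch, not part of the original question; the helper name build_bulk_body is hypothetical, and the document is shortened to three of the fields shown above.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build a newline-delimited bulk payload.

    Each document contributes two lines: an "index" action line
    naming the target index and type, then the document source.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The bulk payload must end with a newline character.
    return "\n".join(lines) + "\n"


# Shortened version of the document from the question (illustrative only).
docs = [{"FIELD1": 1, "FIELD2": "2015-11-18 15:32:18", "FIELD3": "MATTHEWS"}]
body = build_bulk_body("abtclm3", "test", docs)
print(body, end="")
```

Serializing each line with json.dumps avoids quoting mistakes in hand-edited files, which would otherwise make the bulk request fail to parse.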

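Note that some fields, for example FIELD2, should be classified as date, and FIELD31 should be classified as long. Indexing succeeds, and when I look at the data the numbers are classified correctly, but everything else ends up as string. How do I make sure that the fields holding timestamps are classified as date?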
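
One way to see how the fields were classified is to group the field names in the index mapping by their type. The Python sketch below is only an illustration: fields_by_type is a hypothetical helper, the response layout (index, then "mappings", then type, then "properties") is assumed to be the typed-mapping shape used by Elasticsearch versions that accept the template above, and the sample dictionary is a hand-written, abbreviated stand-in for a real response.

```python
from collections import defaultdict


def fields_by_type(mapping_response, index, doc_type):
    """Group top-level field names by their mapped type."""
    props = mapping_response[index]["mappings"][doc_type]["properties"]
    grouped = defaultdict(list)
    for name, spec in sorted(props.items()):
        # Fields without an explicit "type" are object fields.
        grouped[spec.get("type", "object")].append(name)
    return dict(grouped)


# Hand-written, abbreviated stand-in for a mapping response (assumption).
sample = {
    "abtclm3": {
        "mappings": {
            "test": {
                "properties": {
                    "FIELD1": {"type": "long"},
                    "FIELD2": {"type": "string"},
                    "FIELD31": {"type": "long"},
                }
            }
        }
    }
}

result = fields_by_type(sample, "abtclm3", "test")
print(result)
```

Any timestamp field that shows up under string rather than date in such a listing was not picked up by date detection.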

Best answer

There are a lot of date formats in there. You need a template like this:

    {
      "template": "abt*",
      "settings": {
        "index.refresh_interval": "5s",
        "number_of_shards": 5,
        "number_of_replicas": 1,
        "index.codec": "best_compression"
      },
      "mappings": {
        "_default_": {
          "dynamic_date_formats": ["dateOptionalTime||yyyy-MM-dd HH:mm:ss||MM/dd/yyyy HH:mm||MM/dd/yyyy hh:mm:ss aa ZZ"],
          "_all": {
            "enabled": false
          },
          "_source": {
            "enabled": true
          },
          "dynamic_templates": [
            {
              "message_field": {
                "match": "message",
                "match_mapping_type": "string",
                "mapping": {
                  "type": "string",
                  "index": "analyzed",
                  "omit_norms": true,
                  "fielddata": {
                    "format": "disabled"
                  }
                }
              }
            },
            {
              "dates": {
                "match": "*",
                "match_mapping_type": "date",
                "mapping": {
                  "type": "date",
                  "format": "dateOptionalTime||yyyy-MM-dd HH:mm:ss||MM/dd/yyyy HH:mm||MM/dd/yyyy hh:mm:ss aa ZZ"
                }
              }
            },
            {
              "string_fields": {
                "match": "*",
                "match_mapping_type": "string",
                "mapping": {
                  "type": "string",
                  "index": "analyzed",
                  "omit_norms": true,
                  "fielddata": {
                    "format": "disabled"
                  },
                  "fields": {
                    "raw": {
                      "type": "string",
                      "index": "not_analyzed",
                      "ignore_above": 256
                    }
                  }
                }
              }
            }
          ]
        }
      }
    }

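This probably doesn't cover all the formats you have in there, so you need to add the remaining ones. The idea is to list them under dynamic_date_formats, separated by ||, and to list them again under the format field of the date mapping itself.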
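
A rough way to find values that a list of formats would still miss is to try each sample string against the candidate patterns before indexing. The Python sketch below does this with datetime.strptime. It is only an approximation: the strptime directives are assumed counterparts of the date patterns, not the pattern syntax Elasticsearch itself uses, the two parsers may differ in how lenient they are, and first_matching_format is a hypothetical helper name.

```python
from datetime import datetime

# Assumed strptime counterparts of the custom patterns (not Elasticsearch syntax).
# %p relies on the default C/English locale for "AM"/"PM".
CANDIDATE_FORMATS = [
    "%Y-%m-%d %H:%M:%S",        # e.g. 2015-11-18 15:32:18
    "%m/%d/%Y %H:%M",           # e.g. 05/15/2010 11:30
    "%m/%d/%Y %I:%M:%S %p %z",  # e.g. 08/27/2010 07:00:00 AM +0000
]


def first_matching_format(value):
    """Return the first candidate format that parses value, or None."""
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            continue
        return fmt
    return None


# Values taken from the sample document in the question.
samples = [
    "2015-11-18 15:32:18",
    "05/15/2010 11:30",
    "05/19/2010 7:00",
    "08/27/2010 07:00:00 AM +0000",
    "MATTHEWS",
]

uncovered = [s for s in samples if first_matching_format(s) is None]
print(uncovered)
```

Strings left in uncovered are either plain text or timestamps in a format that still needs to be added to the list.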

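To see what is needed to define them, refer to the section of the Elasticsearch documentation on built-in date formats, and to the documentation on custom date format patterns for any custom formats you plan to use.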

Regarding json - Elasticsearch: default template does not detect dates, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37118295/
