
elasticsearch - Elasticsearch terms aggregation skips some entries

Reposted. Author: 行者123. Updated: 2023-12-02 23:23:25

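We use Elasticsearch to collect SQL statistics.
At some point we noticed that some entries do not show up in aggregations.

Here is a sample request (originally generated by Kibana):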

POST /_msearch 
{"index":["stat-2017-09-04"],"ignore_unavailable":true,"preference":1504514752086}
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "analyze_wildcard": true,
            "query": "Group:spbpro.db.sql AND AppUserName:robot"
          }
        },
        {
          "range": {
            "EndTime": {
              "gte": 1504503690000,
              "lte": 1504503692800,
              "format": "epoch_millis"
            }
          }
        }
      ],
      "must_not": []
    }
  },
  "aggs": {
    "3": {
      "terms": {
        "field": "Name.keyword",
        "size": 5000,
        "order": {
          "1": "desc"
        }
      },
      "aggs": {
        "1": {
          "sum": {
            "field": "TotalTime"
          }
        },
        "2": {
          "date_histogram": {
            "field": "EndTime",
            "interval": "20ms",
            "time_zone": "Asia/Baghdad",
            "min_doc_count": 1
          },
          "aggs": {
            "1": {
              "sum": {
                "field": "TotalTime"
              }
            }
          }
        }
      }
    }
  }
}

This is the response from Elasticsearch:
{
  "responses": [
    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 4.754195,
        "hits": [
          {
            "_index": "stat-2017-09-04",
            "_type": "stat-spbpro.db.sql",
            "_id": "AV5LaI15AUHnqGLtN2GS",
            "_score": 4.754195,
            "_source": {
              "Group": "spbpro.db.sql",
              "Name": "select * from (select a.IDPU, sum(d.COUNT)as CNT from ( select IDPU, max(ID) as ID from (select IDPU, ID from PARAMS where IDTPPARAM in (select ID from TPPARAMS where IDTPARC=?)) where ID in (select IDPARAM from DATA_1064_A where DTPU>=? and DTPU<=?) group by IDPU ) a join DATA_1064_A d on d.IDPARAM=a.ID and DTPU>=? and DTPU<=? group by IDPU) where IDPU in (select ID from TEMP_IDS where IDTYPE=1)",
              "StartTime": "2017-09-04T05:36:09.0559048Z",
              "EndTime": "2017-09-04T05:41:31.7295827Z",
              "TotalTime": 297761.8962,
              "Count": 13
            }
          },
          {
            "_index": "stat-2017-09-04",
            "_type": "stat-spbpro.db.sql",
            "_id": "AV5LaI15AUHnqGLtN2OF",
            "_score": 4.7034826,
            "_source": {
              "Group": "spbpro.db.sql",
              "Name": "select IDPU, count(*) as HRSCNT from PUTEDATAS where DTFR>=? and DTFR<? and IDPU in (select ID from TEMP_IDS where IDTYPE=1) group by IDPU",
              "StartTime": "2017-09-04T05:37:06.2981554Z",
              "EndTime": "2017-09-04T05:41:32.7463729Z",
              "TotalTime": 4277.6874,
              "Count": 13
            }
          }
        ]
      },
      "aggregations": {
        "3": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "1": {
                "value": 4277
              },
              "2": {
                "buckets": [
                  {
                    "1": {
                      "value": 4277
                    },
                    "key_as_string": "2017-09-04T08:41:32.740+03:00",
                    "key": 1504503692740,
                    "doc_count": 1
                  }
                ]
              },
              "key": "select IDPU, count(*) as HRSCNT from PUTEDATAS where DTFR>=? and DTFR<? and IDPU in (select ID from TEMP_IDS where IDTYPE=1) group by IDPU",
              "doc_count": 1
            }
          ]
        }
      },
      "status": 200
    }
  ]
}

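The aggregation contains a bucket for "select IDPU, count(*) as HRSCNT ...". That is correct.

But why is "select * from (select a.IDPU ..." listed only in the hits and missing from the aggregation?

The Elasticsearch version is 5.0.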
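
To make the discrepancy concrete, here is a minimal Python sketch that compares the Name of each hit with the bucket keys of the terms aggregation. The helper name `names_missing_from_buckets` and the `sample` dictionary are hypothetical; `sample` is a trimmed stand-in for one entry of "responses" with shortened placeholder Name values. Since the real response reports sum_other_doc_count 0 with size 5000, the missing bucket is not a size cut-off.

```python
def names_missing_from_buckets(response, agg_name="3"):
    """Return hit Names that have no matching bucket key in the terms aggregation."""
    bucket_keys = {b["key"] for b in response["aggregations"][agg_name]["buckets"]}
    return [
        hit["_source"]["Name"]
        for hit in response["hits"]["hits"]
        if hit["_source"]["Name"] not in bucket_keys
    ]


# Trimmed stand-in for one entry of "responses"; Name values are shortened placeholders.
sample = {
    "hits": {"hits": [
        {"_source": {"Name": "select * from (select a.IDPU ..."}},
        {"_source": {"Name": "select IDPU, count(*) as HRSCNT ..."}},
    ]},
    "aggregations": {"3": {"sum_other_doc_count": 0, "buckets": [
        {"key": "select IDPU, count(*) as HRSCNT ...", "doc_count": 1},
    ]}},
}

missing = names_missing_from_buckets(sample)
print(missing)  # the Names that matched the query but produced no bucket
```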

Best answer

I think your mapping probably looks like this:

...
"Name": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
...

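This is the default mapping for strings when you do not set a mapping explicitly. It means that strings longer than 256 characters are not indexed in the keyword sub-field, and so they do not show up in aggregations. See the ignore_above docs. The source is still stored, so you can see these documents in search results, and you can still search the analyzed field (Name).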
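
As a rough illustration, the following Python sketch applies the same length rule to the two Name values from the question. The function `indexed_in_keyword` is a hypothetical stand-in for what Elasticsearch does internally, and it assumes plain ASCII values so that character count and byte count agree.

```python
IGNORE_ABOVE = 256  # limit from the default dynamic mapping for strings


def indexed_in_keyword(value, ignore_above=IGNORE_ABOVE):
    """Hypothetical check: values longer than ignore_above are skipped by the keyword sub-field."""
    return len(value) <= ignore_above


# The two Name values from the search hits in the question.
LONG_NAME = (
    "select * from (select a.IDPU, sum(d.COUNT)as CNT from ( select IDPU, max(ID) as ID "
    "from (select IDPU, ID from PARAMS where IDTPPARAM in (select ID from TPPARAMS "
    "where IDTPARC=?)) where ID in (select IDPARAM from DATA_1064_A where DTPU>=? "
    "and DTPU<=?) group by IDPU ) a join DATA_1064_A d on d.IDPARAM=a.ID and DTPU>=? "
    "and DTPU<=? group by IDPU) where IDPU in (select ID from TEMP_IDS where IDTYPE=1)"
)
SHORT_NAME = (
    "select IDPU, count(*) as HRSCNT from PUTEDATAS where DTFR>=? and DTFR<? "
    "and IDPU in (select ID from TEMP_IDS where IDTYPE=1) group by IDPU"
)

for name in (LONG_NAME, SHORT_NAME):
    # Print the length and whether Name.keyword would hold a term for this value.
    print(len(name), indexed_in_keyword(name))
```

Only the shorter statement passes the check, which matches the single bucket in the response.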

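You can fix this by creating the mapping explicitly and leaving out ignore_above. You will have to reindex the data into a new index (you cannot change an existing mapping); the reindex API makes this easy. If you only ever search this field as a keyword (and do not need an analyzed field), you can also use just a single keyword field, like this: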
...
"Name": {
  "type": "keyword"
}
...
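
The two steps (create a new index with an explicit mapping, then reindex) could look roughly like the sketch below. It only builds and prints the request bodies in Python and does not send them. The destination index name `stat-2017-09-04-v2` is a made-up example, and the mapping type is taken from the `_type` shown in the hits; if the index holds other types, each would presumably need the same Name mapping.

```python
import json

SOURCE_INDEX = "stat-2017-09-04"    # index name from the question
DEST_INDEX = "stat-2017-09-04-v2"   # hypothetical name for the new index
DOC_TYPE = "stat-spbpro.db.sql"     # _type shown in the search hits

# Explicit mapping: same text + keyword layout as the default, but without ignore_above.
create_index_body = {
    "mappings": {
        DOC_TYPE: {
            "properties": {
                "Name": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword"}},
                }
            }
        }
    }
}

# Body for the reindex API: copy every document from the old index into the new one.
reindex_body = {"source": {"index": SOURCE_INDEX}, "dest": {"index": DEST_INDEX}}

print(f"PUT /{DEST_INDEX}")
print(json.dumps(create_index_body, indent=2))
print("POST /_reindex")
print(json.dumps(reindex_body, indent=2))
```

One caveat worth checking before dropping the limit entirely: Lucene rejects single terms larger than 32766 bytes, so extremely long SQL strings may fail to index as a keyword. Raising ignore_above to a larger value instead of removing it is a possible middle ground.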

Regarding "elasticsearch - Elasticsearch terms aggregation skips some entries", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46049836/
