ElasticSearch post_filter and filtered aggregation behave differently

Repost. Author: 行者123. Updated: 2023-11-29 02:45:38

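I have spent a whole week trying to solve this problem and have no hope left of cracking it. I am following this (rather old) article on e-commerce search and faceted filtering, and so far it has worked well: the search results are great, and the aggregations work great when the filters are applied in the query. I am using ElasticSearch 6.1.1.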

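However, because I want to let my users make multiple selections within a facet, I moved the filters to the post_filter section. This still works well: it filters the results correctly and shows accurate aggregation counts for the whole document set.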

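After reading this question on StackOverflow, I realised that I have to perform some crazy acrobatics with "filtered" aggregations and "special" aggregations so that the aggregations prune one another, show the correct counts, and allow several filters to be used at the same time. I have asked for some clarification on that question but have had no response yet (it is an old question).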

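The problem I have been struggling with is getting a set of filtered aggregations on nested fields, where every facet is filtered by all of the filters.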

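My plan is to use the general (unfiltered) aggregations, keep the aggregation of the selected facet unfiltered (so that I can select multiple entries), and filter all the other aggregations by the currently selected facets, so that I only show the filters I can still apply.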
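The plan above can be sketched in code. The following is a minimal Python sketch, not the author's implementation: it builds one filter aggregation per facet, where each facet is filtered by every selected facet except itself. The helper names (`nested_facet_filter`, `facet_aggregation`), the inner aggregation names, the `match_all` fallback, and the second selection (`Binding` = `Leather`) are assumptions made for illustration; only the field names and the nested path come from the question.

```python
from typing import Dict, List

NESTED_PATH = "string_facets"  # nested path used in the question


def nested_facet_filter(name: str, values: List[str]) -> dict:
    """Nested clause: one nested object must carry the name AND one of the values."""
    return {
        "nested": {
            "path": NESTED_PATH,
            "query": {
                "bool": {
                    "filter": [
                        {"term": {NESTED_PATH + ".facet_name": name}},
                        {"terms": {NESTED_PATH + ".facet_value": values}},
                    ]
                }
            },
        }
    }


def facet_aggregation(name: str, selected: Dict[str, List[str]]) -> dict:
    """Hypothetical helper: aggregation for one facet, filtered by all OTHER selections."""
    others = [
        nested_facet_filter(other, values)
        for other, values in selected.items()
        if other != name
    ]
    # Assumption: with no other selection, fall back to match_all.
    parent_filter = {"bool": {"filter": others}} if others else {"match_all": {}}
    return {
        "filter": parent_filter,
        "aggs": {
            "facets": {
                "nested": {"path": NESTED_PATH},
                "aggs": {
                    "this_facet": {
                        # Keep only the nested objects that belong to this facet.
                        "filter": {"term": {NESTED_PATH + ".facet_name": name}},
                        "aggs": {
                            "facet_value": {
                                "terms": {"field": NESTED_PATH + ".facet_value"}
                            }
                        },
                    }
                },
            }
        },
    }


# Hypothetical selection state: two facets chosen by the user.
selected = {"Cover colour": ["Green"], "Binding": ["Leather"]}
aggregations = {
    "agg_" + name.replace(" ", "_").lower(): facet_aggregation(name, selected)
    for name in selected
}
```

With this shape, the "Cover colour" aggregation is restricted only by the "Binding" selection, so the other colours stay visible and selectable, while the "Binding" aggregation is restricted only by the colour selection.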

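However, if I take the same filters that I use on the documents (where they work fine) and put them into a filtered aggregation, they do not work as expected. The counts are all wrong. I know that aggregations are computed before the post_filter is applied, which is why I am duplicating the filters on the aggregations where I want them.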

Here is my query:

    "query": {
      "bool": {
        "must": [
          {
            "multi_match": {
              "fields": [
                "search_data.full_text_boosted^7",
                "search_data.full_text^2"
              ],
              "type": "cross_fields",
              "analyzer": "full_text_search_analyzer",
              "query": "some book"
            }
          }
        ]
      }
    }

Nothing special here: it works well and returns relevant results.

Here is my filter (in post_filter):

    "post_filter": {
      "bool": {
        "must": [
          {
            "nested": {
              "path": "string_facets",
              "query": {
                "bool": {
                  "filter": [
                    { "term": { "string_facets.facet_name": "Cover colour" } },
                    { "terms": { "string_facets.facet_value": [ "Green" ] } }
                  ]
                }
              }
            }
          }
        ]
      }
    }

Let me stress: this works fine. I see the correct results (in this case 13 results are shown, all of which match the right field: 'Cover colour' = 'Green').

Here is my general (unfiltered) aggregation, which returns all facets with the correct counts for all products:

    "agg_string_facets": {
      "nested": {
        "path": "string_facets"
      },
      "aggregations": {
        "facet_name": {
          "terms": {
            "field": "string_facets.facet_name"
          },
          "aggregations": {
            "facet_value": {
              "terms": {
                "field": "string_facets.facet_value"
              }
            }
          }
        }
      }
    }

This works perfectly too! I see all the aggregations, with accurate facet counts for all documents that match my query.

Now, check this out: I am creating an aggregation on the same nested field, but filtered, so that I get the aggregations and facets that "survive" my filter:

    "agg_all_facets_filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "nested": {
                "path": "string_facets",
                "query": {
                  "bool": {
                    "filter": [
                      { "term": { "string_facets.facet_name": "Cover colour" } },
                      { "terms": { "string_facets.facet_value": [ "Green" ] } }
                    ]
                  }
                }
              }
            }
          ]
        }
      },
      "aggs": {
        "agg_all_facets_filtered": {
          "nested": { "path": "string_facets" },
          "aggregations": {
            "facet_name": {
              "terms": { "field": "string_facets.facet_name" },
              "aggregations": {
                "facet_value": {
                  "terms": { "field": "string_facets.facet_value" }
                }
              }
            }
          }
        }
      }
    }

Note that the filter I use in this aggregation is the same one that filters my results in the first place (in post_filter).
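One way to guarantee that the two filters really are identical is to define the filter once and reference it in both places. The following is a minimal Python sketch, assuming the request body is assembled in application code before it is sent; `build_body` and the variable names are hypothetical, while the field names, analyzer, and aggregation names are taken from the question.

```python
import json

# The nested filter from the question, defined once.
green_cover_filter = {
    "bool": {
        "must": [
            {
                "nested": {
                    "path": "string_facets",
                    "query": {
                        "bool": {
                            "filter": [
                                {"term": {"string_facets.facet_name": "Cover colour"}},
                                {"terms": {"string_facets.facet_value": ["Green"]}},
                            ]
                        }
                    },
                }
            }
        ]
    }
}

# The nested terms aggregation shared by the unfiltered and the filtered variant.
nested_facets = {
    "nested": {"path": "string_facets"},
    "aggregations": {
        "facet_name": {
            "terms": {"field": "string_facets.facet_name"},
            "aggregations": {
                "facet_value": {"terms": {"field": "string_facets.facet_value"}}
            },
        }
    },
}


def build_body(text: str) -> dict:
    """Hypothetical helper: search body with the same filter in both places."""
    multi_match = {
        "fields": ["search_data.full_text_boosted^7", "search_data.full_text^2"],
        "type": "cross_fields",
        "analyzer": "full_text_search_analyzer",
        "query": text,
    }
    return {
        "size": 0,
        "query": {"bool": {"must": [{"multi_match": multi_match}]}},
        "post_filter": green_cover_filter,
        "aggregations": {
            "agg_string_facets": nested_facets,
            "agg_all_facets_filtered": {
                "filter": green_cover_filter,
                "aggs": {"agg_all_facets_filtered": nested_facets},
            },
        },
    }


body = build_body("bible")
print(json.dumps(body, indent=2))
```

Building the body this way rules out a copy-and-paste difference between the two filters as the cause of diverging counts.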

But for some reason the aggregations returned are all wrong, namely the facet counts. For example, in my search here I get 13 results, but the aggregation returned from 'agg_all_facets_filtered' contains only one count: 'Cover colour' = 4.

    {
      "key": "Cover colour",
      "doc_count": 4,
      "facet_value": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          { "key": "Green", "doc_count": 4 }
        ]
      }
    }

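After checking why it is 4, I noticed that 3 of the documents contain the 'Cover colour' facet twice: once for 'Green' and once for some other colour. So it seems my aggregation only counts entries that have that facet name TWICE, or that are shared with other documents. That is why I think my filter on the aggregation is wrong. I have read a lot about AND vs OR in matching and filtering, and I have tried 'filter', 'should', and so on. Nothing fixes it.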
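For reference, the counts a filtered nested aggregation is expected to produce can be worked out by hand. The Python sketch below is an in-memory simulation, not Elasticsearch itself. It uses only the 'Cover colour' entries of the five sample documents listed under UPDATE 2 (the other eight hits are not shown in the question), and the function names are hypothetical. It assumes the usual nested semantics: the filter selects parent documents, and the nested terms aggregation then counts every nested object of those parents.

```python
from collections import Counter
from typing import Dict, List, Tuple

Facet = Tuple[str, str]  # (facet_name, facet_value)

# 'Cover colour' entries of the five sample documents, keyed by _id.
docs: Dict[str, List[Facet]] = {
    "33107": [("Cover colour", "Green")],
    "17240": [("Cover colour", "Green")],
    "17243": [("Cover colour", "Pink"), ("Cover colour", "Green")],
    "33030": [("Cover colour", "Green")],
    "33497": [("Cover colour", "Brown"), ("Cover colour", "Green")],
}


def matches(facets: List[Facet], name: str, values: List[str]) -> bool:
    """Nested filter: at least one nested object has this name AND one of the values."""
    return any(n == name and v in values for n, v in facets)


def filtered_nested_counts(name: str, values: List[str]) -> Counter:
    """Count nested objects per (name, value) across the parents that pass the filter."""
    counts: Counter = Counter()
    for facets in docs.values():
        if matches(facets, name, values):
            counts.update(facets)
    return counts


counts = filtered_nested_counts("Cover colour", ["Green"])
for (name, value), n in sorted(counts.items()):
    print(name, value, n)
```

For these five documents every parent passes the filter, so 'Green' is counted five times (once per hit) and 'Pink' and 'Brown' once each. By the same reasoning the full result set should report 13 for 'Green', matching the unfiltered aggregation, rather than 4.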

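Sorry for the long question, but:

Given that my filter works perfectly on its own, how do I write the aggregation filter so that the returned facets have the correct counts?

Many thanks, everyone.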

UPDATE: As requested, here is my full query (note the filter in post_filter and the same filter in the filtered aggregation):

    {
      "size": 0,
      "query": {
        "bool": {
          "must": [
            {
              "multi_match": {
                "fields": [
                  "search_data.full_text_boosted^7",
                  "search_data.full_text^2"
                ],
                "type": "cross_fields",
                "analyzer": "full_text_search_analyzer",
                "query": "bible"
              }
            }
          ]
        }
      },
      "post_filter": {
        "bool": {
          "must": [
            {
              "nested": {
                "path": "string_facets",
                "query": {
                  "bool": {
                    "filter": [
                      { "term": { "string_facets.facet_name": "Cover colour" } },
                      { "terms": { "string_facets.facet_value": [ "Green" ] } }
                    ]
                  }
                }
              }
            }
          ]
        }
      },
      "aggregations": {
        "agg_string_facets": {
          "nested": {
            "path": "string_facets"
          },
          "aggregations": {
            "facet_name": {
              "terms": {
                "field": "string_facets.facet_name"
              },
              "aggregations": {
                "facet_value": {
                  "terms": {
                    "field": "string_facets.facet_value"
                  }
                }
              }
            }
          }
        },
        "agg_all_facets_filtered": {
          "filter": {
            "bool": {
              "must": [
                {
                  "nested": {
                    "path": "string_facets",
                    "query": {
                      "bool": {
                        "filter": [
                          { "term": { "string_facets.facet_name": "Cover colour" } },
                          { "terms": { "string_facets.facet_value": [ "Green" ] } }
                        ]
                      }
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "agg_all_facets_filtered": {
              "nested": { "path": "string_facets" },
              "aggregations": {
                "facet_name": {
                  "terms": { "field": "string_facets.facet_name" },
                  "aggregations": {
                    "facet_value": {
                      "terms": { "field": "string_facets.facet_value" }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }

The results returned are correct (as far as the documents go). Here is the unfiltered aggregation from the results, for 'agg_string_facets'. Note that 'Green' shows 13 documents, which is correct:

    {
      "key": "Cover colour",
      "doc_count": 483,
      "facet_value": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 111,
        "buckets": [
          { "key": "Black", "doc_count": 87 },
          { "key": "Brown", "doc_count": 75 },
          { "key": "Blue", "doc_count": 45 },
          { "key": "Burgundy", "doc_count": 43 },
          { "key": "Pink", "doc_count": 30 },
          { "key": "Teal", "doc_count": 27 },
          { "key": "Tan", "doc_count": 20 },
          { "key": "White", "doc_count": 18 },
          { "key": "Chocolate", "doc_count": 14 },
          { "key": "Green", "doc_count": 13 }
        ]
      }
    }

And here is the aggregation filtered with the same filter, from 'agg_all_facets_filtered', which shows only 4 for 'Green':

    {
      "key": "Cover colour",
      "doc_count": 4,
      "facet_value": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          { "key": "Green", "doc_count": 4 }
        ]
      }
    }

UPDATE 2: Here are some sample documents returned by the query:

    "hits": {
      "total": 13,
      "max_score": 17.478987,
      "hits": [
        {
          "_index": "redacted",
          "_type": "product",
          "_id": "33107",
          "_score": 17.478987,
          "_source": {
            "type": "product",
            "document_id": 33107,
            "search_data": {
              "full_text": "hcsb compact ultrathin bible mint green leathertouch holman bible staff leather binding 9781433617751 ",
              "full_text_boosted": "HCSB Compact Ultrathin Bible Mint Green Leathertouch Holman Bible Staff "
            },
            "search_result_data": {
              "name": "HCSB Compact Ultrathin Bible, Mint Green Leathertouch (Leather)",
              "preview_image": "/images/products/medium/0.jpg",
              "url": "/Products/ViewOne.aspx?ProductId=33107"
            },
            "string_facets": [
              { "facet_name": "Binding", "facet_value": "Leather" },
              { "facet_name": "Bible size", "facet_value": "Compact" },
              { "facet_name": "Bible size", "facet_value": "Ultrathin" },
              { "facet_name": "Bible version", "facet_value": "HCSB" },
              { "facet_name": "Cover colour", "facet_value": "Green" }
            ]
          }
        },
        {
          "_index": "redacted",
          "_type": "product",
          "_id": "17240",
          "_score": 17.416323,
          "_source": {
            "type": "product",
            "document_id": 17240,
            "search_data": {
              "full_text": "kjv thinline bible compact leather binding 9780310439189 ",
              "full_text_boosted": "KJV Thinline Bible Compact "
            },
            "search_result_data": {
              "name": "KJV Thinline Bible, Compact (Leather)",
              "preview_image": "/images/products/medium/17240.jpg",
              "url": "/Products/ViewOne.aspx?ProductId=17240"
            },
            "string_facets": [
              { "facet_name": "Binding", "facet_value": "Leather" },
              { "facet_name": "Bible size", "facet_value": "Compact" },
              { "facet_name": "Bible size", "facet_value": "Thinline" },
              { "facet_name": "Bible version", "facet_value": "KJV" },
              { "facet_name": "Cover colour", "facet_value": "Green" }
            ]
          }
        },
        {
          "_index": "redacted",
          "_type": "product",
          "_id": "17243",
          "_score": 17.416323,
          "_source": {
            "type": "product",
            "document_id": 17243,
            "search_data": {
              "full_text": "kjv busy mom's bible leather binding 9780310439134 ",
              "full_text_boosted": "KJV Busy Mom'S Bible "
            },
            "search_result_data": {
              "name": "KJV Busy Mom's Bible (Leather)",
              "preview_image": "/images/products/medium/17243.jpg",
              "url": "/Products/ViewOne.aspx?ProductId=17243"
            },
            "string_facets": [
              { "facet_name": "Binding", "facet_value": "Leather" },
              { "facet_name": "Bible size", "facet_value": "Pocket" },
              { "facet_name": "Bible size", "facet_value": "Thinline" },
              { "facet_name": "Bible version", "facet_value": "KJV" },
              { "facet_name": "Cover colour", "facet_value": "Pink" },
              { "facet_name": "Cover colour", "facet_value": "Green" }
            ]
          }
        },
        {
          "_index": "redacted",
          "_type": "product",
          "_id": "33030",
          "_score": 15.674053,
          "_source": {
            "type": "product",
            "document_id": 33030,
            "search_data": {
              "full_text": "apologetics study bible for students grass green leathertou mcdowell sean; holman bible s leather binding 9781433617720 ",
              "full_text_boosted": "Apologetics Study Bible For Students Grass Green Leathertou Mcdowell Sean; Holman Bible S"
            },
            "search_result_data": {
              "name": "Apologetics Study Bible For Students, Grass Green Leathertou (Leather)",
              "preview_image": "/images/products/medium/33030.jpg",
              "url": "/Products/ViewOne.aspx?ProductId=33030"
            },
            "string_facets": [
              { "facet_name": "Binding", "facet_value": "Leather" },
              { "facet_name": "Bible designation", "facet_value": "Study Bible" },
              { "facet_name": "Bible designation", "facet_value": "Students" },
              { "facet_name": "Bible feature", "facet_value": "Indexed" },
              { "facet_name": "Cover colour", "facet_value": "Green" }
            ]
          }
        },
        {
          "_index": "redacted",
          "_type": "product",
          "_id": "33497",
          "_score": 15.674053,
          "_source": {
            "type": "product",
            "document_id": 33497,
            "search_data": {
              "full_text": "hcsb life essentials study bible brown / green getz gene a.; holman bible st imitation leather 9781586400446 ",
              "full_text_boosted": "HCSB Life Essentials Study Bible Brown Green Getz Gene A ; Holman Bible St"
            },
            "search_result_data": {
              "name": "HCSB Life Essentials Study Bible Brown / Green (Imitation Leather)",
              "preview_image": "/images/products/medium/33497.jpg",
              "url": "/Products/ViewOne.aspx?ProductId=33497"
            },
            "string_facets": [
              { "facet_name": "Binding", "facet_value": "Imitation Leather" },
              { "facet_name": "Bible designation", "facet_value": "Study Bible" },
              { "facet_name": "Bible version", "facet_value": "HCSB" },
              { "facet_name": "Binding", "facet_value": "Imitation leather" },
              { "facet_name": "Cover colour", "facet_value": "Brown" },
              { "facet_name": "Cover colour", "facet_value": "Green" }
            ]
          }
        }
      ]
    }

Best answer

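Mystery solved! Thanks for your input. It turns out that the version I was using (6.1.1) has a bug. I don't know exactly what the bug is, but I installed ElasticSearch 6.5, reindexed my data, and with no changes to the query or the mapping everything works!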

Now, I don't know whether I should file a bug report with ES or just leave it, since it is an older version and they have moved on.
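Since the fix here was an upgrade rather than a query change, it can be worth confirming which version a cluster actually runs before debugging the query any further. The sketch below parses the `version.number` field that the root endpoint (`GET /`) returns. The sample response is trimmed and hand-written, the helper names are hypothetical, and plain numeric version strings are assumed. 6.5.0 is used as the threshold only because 6.5 is the version the answer reports as working; the answer does not say which release fixed the bug.

```python
from typing import Tuple


def version_tuple(number: str) -> Tuple[int, ...]:
    """Turn '6.1.1' into (6, 1, 1) so that versions compare numerically."""
    return tuple(int(part) for part in number.split("."))


def is_at_least(info: dict, minimum: str) -> bool:
    """True if the cluster info from GET / reports at least the given version."""
    return version_tuple(info["version"]["number"]) >= version_tuple(minimum)


# Trimmed, hand-written example of a GET / response body.
cluster_info = {"version": {"number": "6.1.1"}}

if not is_at_least(cluster_info, "6.5.0"):
    print("Cluster is older than 6.5.0; the answer reports 6.5 as working.")
```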

Regarding ElasticSearch post_filter and filtered aggregations behaving differently, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53888524/
