elasticsearch - Retrieving the matching array elements in an Elasticsearch query

Reposted. Author: 行者123. Updated: 2023-11-29 02:49:48

In a movie database, I store the ratings (0 to 5 stars) that users give to each movie. I have indexed documents with the following structure in Elasticsearch (version 1.2.2):

{
  "_index": "my_index",
  "_type": "film",
  "_id": "6629",
  "_source": {
    "id": "6629",
    "title": "Fight Club",
    "ratings": [
      { "user_id": 1234, "rating_value": 3 },
      { "user_id": 4567, "rating_value": 2 },
      { "user_id": 7890, "rating_value": 1 }
      .....
    ]
  }
}

{
  "_index": "my_index",
  "_type": "film",
  "_id": "6630",
  "_source": {
    "id": "6630",
    "title": "Pulp Fiction",
    "ratings": [
      { "user_id": 1234, "rating_value": 1 },
      { "user_id": 7654, "rating_value": 2 },
      { "user_id": 4321, "rating_value": 5 }
      .....
    ]
  }
}

And so on.

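My goal is to get, in a single search, all the movies rated by a given user (say user 1234), together with the rating_value that user gave.

If I run the following search:

GET my_index/film/_search
{
  "query": {
    "match": {
      "ratings.user_id": "1234"
    }
  }
}

I get the whole document for every matching movie. I then have to walk the entire ratings array of each hit to find out which element matched my query, and which rating_value is associated with user_id 1234.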
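The client-side workaround described above can be sketched in a few lines of Python. This is only an illustration of the parsing step the question wants to avoid: the helper names `rating_for_user` and `ratings_by_title` are hypothetical, and `sample_hits` is hand-written data shaped like the documents in the question, not the output of a real cluster.

```python
from typing import Any, Dict, List, Optional


def rating_for_user(hit: Dict[str, Any], user_id: int) -> Optional[int]:
    """Return the rating_value that user_id gave in this hit, or None."""
    for rating in hit["_source"].get("ratings", []):
        if rating["user_id"] == user_id:
            return rating["rating_value"]
    return None


def ratings_by_title(hits: List[Dict[str, Any]], user_id: int) -> Dict[str, int]:
    """Map each film title to the rating given by user_id.

    Films the user did not rate are skipped.
    """
    result: Dict[str, int] = {}
    for hit in hits:
        value = rating_for_user(hit, user_id)
        if value is not None:
            result[hit["_source"]["title"]] = value
    return result


# Hand-written hits mirroring the documents in the question (assumption).
sample_hits = [
    {"_id": "6629", "_source": {"id": "6629", "title": "Fight Club", "ratings": [
        {"user_id": 1234, "rating_value": 3},
        {"user_id": 4567, "rating_value": 2},
        {"user_id": 7890, "rating_value": 1},
    ]}},
    {"_id": "6630", "_source": {"id": "6630", "title": "Pulp Fiction", "ratings": [
        {"user_id": 1234, "rating_value": 1},
        {"user_id": 7654, "rating_value": 2},
        {"user_id": 4321, "rating_value": 5},
    ]}},
]

print(ratings_by_title(sample_hits, 1234))
# → {'Fight Club': 3, 'Pulp Fiction': 1}
```

The cost of this approach is that every full ratings array travels over the network and is scanned linearly, which is the overhead the question is trying to remove.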

Ideally, I would like the result of this query to be:

"hits": [
  {
    "_index": "my_index",
    "_type": "film",
    "_id": "6629",
    "_source": {
      "id": "6629",
      "title": "Fight Club",
      "ratings": [
        { "user_id": 1234, "rating_value": 3 }  // <= only the row that matches the query
      ]
    }
  },
  {
    "_index": "my_index",
    "_type": "film",
    "_id": "6630",
    "_source": {
      "id": "6630",
      "title": "Pulp Fiction",
      "ratings": [
        { "user_id": 1234, "rating_value": 1 }  // <= only the row that matches the query
      ]
    }
  }
]

Thanks in advance.

Best answer

As mentioned in my earlier comment, I managed to retrieve the values using aggregations.

Here is how I did it.

First, the mapping I used:

PUT test/movie/_mapping
{
  "properties": {
    "title": {
      "type": "string",
      "index": "not_analyzed"
    },
    "ratings": {
      "type": "nested"
    }
  }
}

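I chose not to analyze the title (it is still indexed, but as a single unanalyzed term), but you could instead use the fields property and keep the unanalyzed version as a "raw" sub-field.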
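As a rough sketch of the alternative mentioned above, the mapping body could keep an analyzed title and add an unanalyzed sub-field. The sub-field name `raw` is only a common convention and is an assumption here, as is the use of the Elasticsearch 1.x `string` type with a `fields` block; the answer itself does not show this variant.

```python
import json

# Assumed variant of the mapping: "title" stays analyzed for full-text search,
# while "title.raw" holds the exact, unanalyzed value for aggregations.
mapping = {
    "properties": {
        "title": {
            "type": "string",
            "fields": {
                "raw": {"type": "string", "index": "not_analyzed"}
            },
        },
        "ratings": {"type": "nested"},
    }
}

# Print the JSON body that would be sent with PUT test/movie/_mapping.
print(json.dumps(mapping, indent=2))
```

With a mapping like this, a terms aggregation on movie titles would target `title.raw` rather than `title`, so that each title stays one bucket instead of being split into analyzed tokens.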

Then, the movies I indexed:

PUT test/movie/6629
{
  "title": "Fight Club",
  "ratings": [
    { "user_id": 1234, "rating_value": 3 },
    { "user_id": 4567, "rating_value": 2 },
    { "user_id": 7890, "rating_value": 1 }
  ]
}

PUT test/movie/4456
{
  "title": "Jumanji",
  "ratings": [
    { "user_id": 1234, "rating_value": 4 },
    { "user_id": 4567, "rating_value": 3 },
    { "user_id": 4630, "rating_value": 5 }
  ]
}

PUT test/movie/6547
{
  "title": "Hook",
  "ratings": [
    { "user_id": 1234, "rating_value": 4 },
    { "user_id": 7890, "rating_value": 1 }
  ]
}

The aggregation query is:

GET test/movie/_search
{
  "aggs": {
    "by_movie": {
      "terms": {
        "field": "title"
      },
      "aggs": {
        "ratings_by_user": {
          "nested": {
            "path": "ratings"
          },
          "aggs": {
            "for_user_1234": {
              "filter": {
                "term": {
                  "ratings.user_id": "1234"
                }
              },
              "aggs": {
                "rating_value": {
                  "terms": {
                    "field": "ratings.rating_value"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
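Since the query above hard-codes user 1234, a small builder can produce the same request body for any user. The following is a minimal sketch, not part of the original answer: the function name `build_user_rating_aggs` and its `title_field` parameter are hypothetical, and the user id is passed as an integer rather than the string used above.

```python
import json
from typing import Any, Dict


def build_user_rating_aggs(user_id: int, title_field: str = "title") -> Dict[str, Any]:
    """Build the nested aggregation body for one user.

    Structure: bucket by movie title, step into the nested "ratings"
    objects, keep only those of user_id, then bucket by rating_value.
    """
    return {
        "aggs": {
            "by_movie": {
                "terms": {"field": title_field},
                "aggs": {
                    "ratings_by_user": {
                        "nested": {"path": "ratings"},
                        "aggs": {
                            # Same naming scheme as the query above.
                            f"for_user_{user_id}": {
                                "filter": {"term": {"ratings.user_id": user_id}},
                                "aggs": {
                                    "rating_value": {
                                        "terms": {"field": "ratings.rating_value"}
                                    }
                                },
                            }
                        },
                    }
                },
            }
        }
    }


# Body to send with GET test/movie/_search for user 1234.
print(json.dumps(build_user_rating_aggs(1234), indent=2))
```

Building the body as a plain dictionary keeps the sketch independent of any particular client library; it can be serialized and sent by whatever HTTP or Elasticsearch client is in use.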

Finally, here is the output produced when running this query against the documents above:

"aggregations": {
  "by_movie": {
    "buckets": [
      {
        "key": "Fight Club",
        "doc_count": 1,
        "ratings_by_user": {
          "doc_count": 3,
          "for_user_1234": {
            "doc_count": 1,
            "rating_value": {
              "buckets": [
                {
                  "key": 3,
                  "key_as_string": "3",
                  "doc_count": 1
                }
              ]
            }
          }
        }
      },
      {
        "key": "Hook",
        "doc_count": 1,
        "ratings_by_user": {
          "doc_count": 2,
          "for_user_1234": {
            "doc_count": 1,
            "rating_value": {
              "buckets": [
                {
                  "key": 4,
                  "key_as_string": "4",
                  "doc_count": 1
                }
              ]
            }
          }
        }
      },
      {
        "key": "Jumanji",
        "doc_count": 1,
        "ratings_by_user": {
          "doc_count": 3,
          "for_user_1234": {
            "doc_count": 1,
            "rating_value": {
              "buckets": [
                {
                  "key": 4,
                  "key_as_string": "4",
                  "doc_count": 1
                }
              ]
            }
          }
        }
      }
    ]
  }
}

It is a bit tedious because of the nested syntax, but it lets you retrieve, for each movie, the rating given by the specified user (here 1234).
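To show how the nested response could be flattened into a simple title-to-rating mapping, here is a minimal sketch. The function name `ratings_from_aggs` is hypothetical, and `sample` is a hand-copied, abbreviated version of the aggregation output shown in this answer rather than a live response.

```python
from typing import Any, Dict


def ratings_from_aggs(aggregations: Dict[str, Any], user_id: int) -> Dict[str, int]:
    """Flatten the by_movie aggregation into {title: rating_value}."""
    result: Dict[str, int] = {}
    for movie in aggregations["by_movie"]["buckets"]:
        user_agg = movie["ratings_by_user"][f"for_user_{user_id}"]
        buckets = user_agg["rating_value"]["buckets"]
        # The bucket list is empty when the user did not rate this movie.
        if buckets:
            result[movie["key"]] = buckets[0]["key"]
    return result


def _movie_bucket(title: str, nested_count: int, rating: int) -> Dict[str, Any]:
    """Helper that rebuilds one bucket in the shape shown in the answer."""
    return {
        "key": title,
        "doc_count": 1,
        "ratings_by_user": {
            "doc_count": nested_count,
            "for_user_1234": {
                "doc_count": 1,
                "rating_value": {
                    "buckets": [
                        {"key": rating, "key_as_string": str(rating), "doc_count": 1}
                    ]
                },
            },
        },
    }


# Abbreviated copy of the "aggregations" section from the output above.
sample = {
    "by_movie": {
        "buckets": [
            _movie_bucket("Fight Club", 3, 3),
            _movie_bucket("Hook", 2, 4),
            _movie_bucket("Jumanji", 3, 4),
        ]
    }
}

print(ratings_from_aggs(sample, 1234))
# → {'Fight Club': 3, 'Hook': 4, 'Jumanji': 4}
```

Two caveats apply to this approach. The query has no top-level filter, so movies the user never rated still appear as buckets, with an empty `rating_value` list, which the sketch skips. The terms aggregation also returns only a limited number of buckets by default, so a large catalog would need an explicit `size` on `by_movie`.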

Hope this helps!

Regarding "elasticsearch - Retrieving the matching array elements in an Elasticsearch query", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/25284609/
