
elasticsearch - Elasticsearch: Generating an aggregated field during a search query

Reposted. Author: 行者123. Updated: 2023-12-03 00:01:32

I'm new to ES. I am trying to implement a search engine over a source with the following schema, all in one index:

index: paper
{
  "title": string,
  "author": string,
  "id": string,
  "references": [string:another_paper.id, string:another_paper.id, ...],
  "pubDate": date
}

Suppose I want to search for all papers by the author "Smith" published between January 9, 2017 and January 30, 2017.

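How would I design a search query that returns the results with a generated field showing how many times each document is cited by other documents in their "references" field? Is this even possible in ES?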

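Execution speed is not critical. I can live with relatively slow queries, but I do not want to update existing documents when I upload a new one.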

Thanks.

Best answer

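You can certainly get results by author name and date range.
With the query shown further down, you get the IDs referenced by the documents that match the query, along with a document count for each ID.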

In short, you get a count for each referenced document, computed from the other documents that reference it.

For example, suppose you have indexed these 3 documents:

{
  "title": "title1",
  "author": "bob",
  "id": "id1",
  "references": [
    "id1",
    "id2",
    "id3"
  ],
  "pubDate": "01-01-2018"
},
{
  "title": "title2",
  "author": "harry",
  "id": "id2",
  "references": [
    "id1",
    "id3",
    "id7",
    "id8"
  ],
  "pubDate": "01-02-2018"
},
{
  "title": "title3",
  "author": "bob",
  "id": "id3",
  "references": [
    "id1",
    "id4",
    "id7",
    "id9"
  ],
  "pubDate": "01-03-2018"
}

Once these documents are indexed, you can run this query:
GET test_stackoverflow_agg/type1/_search
{
  "query": {
    "query_string": {
      "query": "author:bob AND pubDate:[2018-01-02 TO 2018-01-04]"
    }
  },
  "aggs": {
    "agg1": {
      "terms": {
        "field": "references",
        "size": 10
      }
    }
  }
}
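The same request can also be assembled programmatically. The sketch below is not part of the original answer: it builds an equivalent-in-intent body in Python, swapping the `query_string` for a `bool` query with `match` and `range` clauses in filter context (so relevance scores are not computed). The function name `build_search_body` and its parameters are hypothetical, and it assumes `references` is mapped so that a `terms` aggregation can run on it.

```python
import json


def build_search_body(author, start_date, end_date, agg_size=10):
    """Build a search body: filter by author and date range, then
    aggregate on the `references` field (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                # Filter context: documents either match or not, no scoring.
                "filter": [
                    {"match": {"author": author}},
                    # gte/lte are inclusive, like [a TO b] in query_string.
                    {"range": {"pubDate": {"gte": start_date, "lte": end_date}}},
                ]
            }
        },
        "aggs": {
            # One bucket per distinct ID found in `references`.
            "agg1": {"terms": {"field": "references", "size": agg_size}}
        },
    }


body = build_search_body("bob", "2018-01-02", "2018-01-04")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the `_search` request in place of the `query_string` version.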

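The query part specifies which documents to filter, and

the aggregation part specifies the field to count on: here, the unique IDs present in the references field.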
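To make the filter-then-count behaviour concrete, here is a minimal Python sketch that reproduces it locally on the three sample documents, without Elasticsearch. It is an illustration only: the helper name `reference_counts` is hypothetical, and the date range is left out because the sample documents and the sample response show `pubDate` in different formats. Each document contributes at most once per ID, which mirrors how `doc_count` counts documents rather than occurrences.

```python
from collections import Counter

# The three sample documents, reduced to the fields the sketch needs.
papers = [
    {"id": "id1", "author": "bob", "references": ["id1", "id2", "id3"]},
    {"id": "id2", "author": "harry", "references": ["id1", "id3", "id7", "id8"]},
    {"id": "id3", "author": "bob", "references": ["id1", "id4", "id7", "id9"]},
]


def reference_counts(docs, author):
    """Count, per referenced ID, how many of the author's documents cite it."""
    matched = [d for d in docs if d["author"] == author]  # the "query" part
    counts = Counter()
    for doc in matched:  # the "aggregation" part
        counts.update(set(doc["references"]))  # one count per document
    return counts


counts = reference_counts(papers, "bob")
print(sorted(counts.items()))
# [('id1', 2), ('id2', 1), ('id3', 1), ('id4', 1), ('id7', 1), ('id9', 1)]
```

Only the documents by "bob" feed the counts, so the references listed in harry's document (such as "id8") do not appear.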

This is what the result looks like:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.0460204,
    "hits": [
      {
        "_index": "test_stackoverflow_agg",
        "_type": "type1",
        "_id": "id3",
        "_score": 1.0460204,
        "_source": {
          "title": "title3",
          "author": "bob",
          "id": "id3",
          "references": [
            "id1",
            "id4",
            "id7",
            "id9"
          ],
          "pubDate": "2018-01-03"
        }
      },
      {
        "_index": "test_stackoverflow_agg",
        "_type": "type1",
        "_id": "id1",
        "_score": 1.0460204,
        "_source": {
          "title": "title1",
          "author": "bob",
          "id": "id1",
          "references": [
            "id1",
            "id2",
            "id3"
          ],
          "pubDate": "2018-01-02"
        }
      }
    ]
  },
  "aggregations": {
    "agg1": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "id1",
          "doc_count": 2
        },
        {
          "key": "id2",
          "doc_count": 1
        },
        {
          "key": "id3",
          "doc_count": 1
        },
        {
          "key": "id4",
          "doc_count": 1
        },
        {
          "key": "id7",
          "doc_count": 1
        },
        {
          "key": "id9",
          "doc_count": 1
        }
      ]
    }
  }
}
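The response keeps the hits and the counts in separate sections, so attaching a count to each hit is left to the client. The Python sketch below, which is not from the original answer, shows one way to do that; the helper names and the `citedBy` field are hypothetical. Because an aggregation only sees the documents matched by the query, these counts reflect citations from the matched papers, not from the whole index. The last helper therefore sketches a possible follow-up request body that counts citations index-wide for the returned IDs, using a `terms` query plus the `include` option of the `terms` aggregation; treat it as an untested assumption about how one might proceed.

```python
def citation_counts(response, agg_name="agg1"):
    """Turn the terms-aggregation buckets into an {id: doc_count} dict."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {b["key"]: b["doc_count"] for b in buckets}


def annotate_hits(response, agg_name="agg1"):
    """Copy each hit's _source and add a hypothetical `citedBy` field."""
    counts = citation_counts(response, agg_name)
    return [
        {**hit["_source"], "citedBy": counts.get(hit["_source"]["id"], 0)}
        for hit in response["hits"]["hits"]
    ]


def build_index_wide_count_body(paper_ids):
    """Hypothetical second request: count citations across the whole index,
    restricted to the given paper IDs."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"terms": {"references": paper_ids}},
        "aggs": {
            "cited_by": {
                "terms": {
                    "field": "references",
                    "include": paper_ids,  # keep buckets for these IDs only
                    "size": max(len(paper_ids), 1),
                }
            }
        },
    }


# Trimmed copy of the sample response: only the fields used here.
sample_response = {
    "hits": {"hits": [{"_source": {"id": "id3"}}, {"_source": {"id": "id1"}}]},
    "aggregations": {"agg1": {"buckets": [
        {"key": "id1", "doc_count": 2}, {"key": "id2", "doc_count": 1},
        {"key": "id3", "doc_count": 1}, {"key": "id4", "doc_count": 1},
        {"key": "id7", "doc_count": 1}, {"key": "id9", "doc_count": 1},
    ]}},
}

annotated = annotate_hits(sample_response)
follow_up = build_index_wide_count_body([p["id"] for p in annotated])
print(annotated)
```

Note that in the sample data "id1" lists itself in its own references, so its count of 2 includes a self-citation.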

Regarding "elasticsearch - Elasticsearch: Generating an aggregated field during a search query", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48588665/
