
elasticsearch - Elasticsearch: group by parent using the path tokenizer

Reposted. Author: 行者123. Updated: 2023-12-02 23:35:09

I have the following documents:

PUT /my_index/topic/1
{
  "path" : "fruits"
}

PUT /my_index/topic/2
{
  "path" : "fruits/apple"
}

PUT /my_index/topic/3
{
  "path" : "fruits/pear"
}

PUT /my_index/topic/4
{
  "path" : "vegetables"
}

PUT /my_index/topic/5
{
  "path" : "vegetables/carrot"
}

PUT /my_index/topic/6
{
  "path" : "vegetables/broccoli"
}

I'm trying to figure out how to aggregate these documents so that I get the following output:

fruits {
  apple,
  pear
}

vegetables {
  carrot,
  broccoli
}

Best answer

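One way I've found to do this is to combine the path_hierarchy tokenizer with a token_count sub-field. The tokenizer indexes every prefix of a path (so "fruits/apple" is indexed as both "fruits" and "fruits/apple"), and the token_count sub-field records how many segments the path has. First, we create my_index like this: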

curl -XPUT localhost:9200/my_index -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "path-analyzer": {
          "type": "custom",
          "tokenizer": "path-tokenizer"
        }
      },
      "tokenizer": {
        "path-tokenizer": {
          "type": "path_hierarchy"
        }
      }
    }
  },
  "mappings": {
    "topic": {
      "properties": {
        "path": {
          "type": "string",
          "index_analyzer": "path-analyzer",
          "fields": {
            "tokens": {
              "type": "token_count",
              "store": "yes",
              "analyzer": "standard"
            }
          }
        }
      }
    }
  }
}'
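
To make the effect of this mapping more concrete, here is a minimal Python sketch that emulates what the two analyzers produce for the sample paths. It does not call Elasticsearch. The function names path_hierarchy_tokens and approx_token_count are hypothetical, and the sketch assumes "/" as the delimiter and paths with no leading or trailing slash. The word-run count is only a rough stand-in for the standard analyzer.

```python
import re


def path_hierarchy_tokens(path, delimiter="/"):
    """Emulate path_hierarchy: emit every prefix of the path, shortest first.

    Assumes the path has no leading or trailing delimiter.
    """
    parts = path.split(delimiter)
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


def approx_token_count(path):
    """Approximate token_count with the standard analyzer by counting word runs."""
    return len(re.findall(r"\w+", path))


if __name__ == "__main__":
    for p in ["fruits", "fruits/apple", "vegetables/carrot"]:
        print(p, path_hierarchy_tokens(p), approx_token_count(p))
```

Under these assumptions, "fruits/apple" yields the terms "fruits" and "fruits/apple" with a count of 2, which is why a terms aggregation on path can bucket a child under its parent and why path.tokens can be used to select a single depth.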

Then we index your documents using the same PUT requests as in your question.

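Finally, the following search query produces the desired output. The first_level aggregation keeps only terms without a slash (the parents), the filter keeps documents whose path has exactly two segments, and the inner terms aggregation keeps only terms that contain a slash (the children):
curl -XPOST 'localhost:9200/my_index/topic/_search?pretty' -d '{
  "size": 0,
  "aggs": {
    "first_level": {
      "terms": {
        "field": "path",
        "exclude": ".*/.*"
      },
      "aggs": {
        "second_level": {
          "filter": {
            "term": {
              "path.tokens": 2
            }
          },
          "aggs": {
            "type": {
              "terms": {
                "field": "path",
                "include": ".*/.*"
              }
            }
          }
        }
      }
    }
  }
}'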

This outputs:
{
  ...
  "aggregations" : {
    "first_level" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
        "key" : "fruits",
        "doc_count" : 3,
        "second_level" : {
          "doc_count" : 2,
          "type" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [ {
              "key" : "fruits/apple",
              "doc_count" : 1
            }, {
              "key" : "fruits/pear",
              "doc_count" : 1
            } ]
          }
        }
      }, {
        "key" : "vegetables",
        "doc_count" : 3,
        "second_level" : {
          "doc_count" : 2,
          "type" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [ {
              "key" : "vegetables/broccoli",
              "doc_count" : 1
            }, {
              "key" : "vegetables/carrot",
              "doc_count" : 1
            } ]
          }
        }
      } ]
    }
  }
}
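
The response still carries full paths and bucket metadata, so a small client-side step is needed to reach the parent/children shape asked for in the question. The following Python sketch shows one way to do that. The function name group_children is hypothetical, and the sample dictionary is a trimmed copy of the response above with the counts and error bounds left out.

```python
def group_children(response):
    """Map each first-level key to the last path segment of its child keys."""
    grouped = {}
    for parent in response["aggregations"]["first_level"]["buckets"]:
        children = parent["second_level"]["type"]["buckets"]
        # "fruits/apple" -> "apple": keep only the segment after the last slash.
        grouped[parent["key"]] = [c["key"].rsplit("/", 1)[-1] for c in children]
    return grouped


# Trimmed copy of the aggregation response (illustrative, not fetched live).
sample = {
    "aggregations": {
        "first_level": {
            "buckets": [
                {"key": "fruits", "second_level": {"type": {"buckets": [
                    {"key": "fruits/apple"}, {"key": "fruits/pear"}]}}},
                {"key": "vegetables", "second_level": {"type": {"buckets": [
                    {"key": "vegetables/broccoli"}, {"key": "vegetables/carrot"}]}}},
            ]
        }
    }
}

if __name__ == "__main__":
    print(group_children(sample))
    # {'fruits': ['apple', 'pear'], 'vegetables': ['broccoli', 'carrot']}
```

The children appear in whatever order the terms aggregation returns them, so sort them on the client if a specific order matters.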

This post, "elasticsearch - Elasticsearch: group by parent using the path tokenizer", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/31926206/
