elasticsearch - How do I get the total number of word occurrences in Elasticsearch?

Reposted. Author: 行者123. Updated: 2023-12-03 02:17:49

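Is there a way to get the total number of times a searched string occurs, rather than the number of result hits?
My real data structure is somewhat complex and uses nested documents, but I have added a simplified version of the data below. If someone can help me find an answer, I can adapt it to my own version of the code.
The Elasticsearch data is: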

[
{
"page": 1,
"text": "Sample PDF Document.\nLorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum."
},
{
"page": 2,
"text": "sample PDF sample Document test content"
},
{
"page": 3,
"text": "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.\n sample content"
},
{
"page": 4,
"text": "PDF test sample Document lorem ipsum sample.Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. Sample content."
},
{
"page": 5,
"text": "PDF Document"
},
{
"page": 6,
"text": "sdsd"
},
{
"page": 7,
"text": "lorem ipsum"
}
]
I was able to do a filter aggregation, but the text "sample PDF sample Document test content" returns a count of 1, even though the word "sample" appears twice in that field.
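To make the gap between the two numbers concrete, here is a minimal client-side sketch in Python. It does not query Elasticsearch. It only mimics the two kinds of count on the four shortest pages from the data above. The helper names (`tokenize`, `hit_count`, `occurrence_count`) and the simple lowercase `\w+` tokenization are assumptions made for illustration, not the analyzer the index actually uses.

```python
import re

# The four shortest pages from the sample data in the question.
pages = [
    {"page": 2, "text": "sample PDF sample Document test content"},
    {"page": 5, "text": "PDF Document"},
    {"page": 6, "text": "sdsd"},
    {"page": 7, "text": "lorem ipsum"},
]


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenization)."""
    return re.findall(r"\w+", text.lower())


def hit_count(pages, word):
    """Number of pages containing the word at least once (what a filter counts)."""
    word = word.lower()
    return sum(1 for p in pages if word in tokenize(p["text"]))


def occurrence_count(pages, word):
    """Total number of times the word appears across all pages."""
    word = word.lower()
    return sum(tokenize(p["text"]).count(word) for p in pages)
```

With this data, `hit_count(pages, "sample")` counts page 2 once, while `occurrence_count(pages, "sample")` counts both appearances on that page, which is the number the question is after.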

Best answer

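Check this answer. It can also be refactored to handle nested fields and to count only a given subset of words. Note that it may be slow, because all of the word splitting is repeated every time it runs.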
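The linked answer is not reproduced on this page, so the following is only a rough Python sketch of the idea described: split every text into words, walk a nested field, and count only a chosen subset of words. The nested layout (a `pages` list inside each document), the function name `count_words`, and the `\w+` tokenization are hypothetical, and the code runs on the client rather than inside Elasticsearch.

```python
import re
from collections import Counter

TOKEN = re.compile(r"\w+")

# Hypothetical nested layout: each document holds a list of pages.
documents = [
    {"pages": [
        {"page": 2, "text": "sample PDF sample Document test content"},
        {"page": 5, "text": "PDF Document"},
    ]},
    {"pages": [
        {"page": 7, "text": "lorem ipsum"},
    ]},
]


def count_words(documents, wanted, nested_field="pages", text_field="text"):
    """Count total occurrences of each wanted word across all nested pages."""
    wanted = {w.lower() for w in wanted}
    # Start every wanted word at zero so absent words still appear in the result.
    totals = Counter({w: 0 for w in wanted})
    for doc in documents:
        for page in doc.get(nested_field, []):
            # Every text is re-split on each call, which is why this can be slow.
            for token in TOKEN.findall(page.get(text_field, "").lower()):
                if token in wanted:
                    totals[token] += 1
    return totals


totals = count_words(documents, ["sample", "pdf", "missing"])
```

Restricting the count to a `wanted` set keeps the result small, but it does not avoid the splitting cost, since each text still has to be tokenized in full.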

Regarding "elasticsearch - How do I get the total number of word occurrences in Elasticsearch?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63174836/
