elasticsearch - Getting results from Elasticsearch

Repost  Author: 行者123  Updated: 2023-12-03 02:21:02

I am already familiar with the shingle analyzer, and I am able to create a shingle analyzer as follows:

    "settings": {
      "index": {
        "number_of_shards": 10,
        "number_of_replicas": 1
      },
      "analysis": {
        "analyzer": {
          "shingle_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "filter_shingle"
            ]
          }
        },
        "filter": {
          "filter_shingle": {
            "type": "shingle",
            "max_shingle_size": 2,
            "min_shingle_size": 2,
            "output_unigrams": false
          }
        }
      }
    }
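
To make concrete what such an analyzer emits, here is a minimal Python sketch that mimics a two-word shingle filter with `output_unigrams` set to false. It is an approximation, not Elasticsearch's implementation: the regex split is only a rough stand-in for the `standard` tokenizer, and the function name `shingles` is made up for illustration.

```python
import re


def shingles(text, size=2):
    """Return word shingles of exactly `size` words, lowercased."""
    # Rough stand-in for the standard tokenizer plus the lowercase filter.
    words = re.findall(r"\w+", text.lower())
    # Slide a window of `size` words over the token list.
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


print(shingles("Find the latest breaking news"))
# ['find the', 'the latest', 'latest breaking', 'breaking news']
```

In this sketch a single word such as `news` yields no shingle at all, because there is no second word to pair it with.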

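I then use the analyzer defined in the mapping for a field named content in my document. The problem is that the content field is a very long text, and I want to use it as data for an autocomplete suggester, so I only need the one or two words that follow the matched phrase. I wonder whether there is a way to get the search (or suggest, or analyze) API results as shingles too. By using the shingle analyzer, Elasticsearch itself indexes the text as shingles; is there a way to access those shingles?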

For example, the query I send is:

    GET the_index/_search
    {
      "_source": ["content"],
      "explain": true,
      "query": {
        "match": { "content.shngled_field": "news" }
      }
    }

The result is:

    {
      "took" : 395,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 10000,
          "relation" : "gte"
        },
        "max_score" : 7.8647532,
        "hits" : [
          {
            "_shard" : "[v3_kavan_telegram_201911][0]",
            "_node" : "L6vHYla-TN6CHo2I6g4M_A",
            "_index" : "v3_kavan_telegram_201911",
            "_type" : "_doc",
            "_id" : "g1music/70733",
            "_score" : 7.8647532,
            "_source" : {
              "content" : "Find the latest breaking news and information on the top stories, weather, business, entertainment, politics, and more."
              ....
            }

As you can see, the result contains the whole content field, which is a very long text. The result I expect is:

    "content" : "news and information on"

that is, the matched shingle itself.

Best answer

After creating the index and ingesting a document:

    PUT sh
    {
      "mappings": {
        "properties": {
          "content": {
            "type": "text",
            "fields": {
              "shingled": {
                "type": "text",
                "analyzer": "shingle_analyzer"
              }
            }
          }
        }
      },
      "settings": {
        "analysis": {
          "analyzer": {
            "shingle_analyzer": {
              "type": "custom",
              "tokenizer": "standard",
              "filter": [
                "lowercase",
                "filter_shingle"
              ]
            }
          },
          "filter": {
            "filter_shingle": {
              "type": "shingle",
              "max_shingle_size": 2,
              "min_shingle_size": 2,
              "output_unigrams": false
            }
          }
        }
      }
    }

    POST sh/_doc/1
    {
      "content": "and then I use the defined analyzer in mapping for a field in my document named content.The problem is the content field is a very long text and I want to use it as data for a autocomplete suggester, so I just need one or two words that follow the matched phrase. I wonder if there is a way to get the search (or suggest or analyze) API result as shingles too. By using shingle analyzer the elastic itself indexes the text as shingles, is there a way to access those shingles?"
    }

You can see how a given text is tokenized by calling _analyze with the corresponding analyzer:

    GET sh/_analyze
    {
      "text": "and then I use the defined analyzer in mapping for a field in my document named content.The problem is the content field is a very long text and I want to use it as data for a autocomplete suggester, so I just need one or two words that follow the matched phrase. I wonder if there is a way to get the search (or suggest or analyze) API result as shingles too. By using shingle analyzer the elastic itself indexes the text as shingles, is there a way to access those shingles?",
      "analyzer": "shingle_analyzer"
    }
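
For the autocomplete use case, the `_analyze` response can be filtered on the client side. The Python sketch below assumes the response has already been parsed into a dict with a top-level `tokens` list whose entries carry a `token` key. The helper name `shingles_after` and the `sample` response are hand-written for illustration, not output captured from a cluster.

```python
def shingles_after(analyze_response, word):
    """Return the distinct shingles whose first word equals `word`."""
    prefix = word.lower() + " "
    found = []
    for tok in analyze_response.get("tokens", []):
        term = tok["token"]
        # Keep first-seen order and skip repeats of the same shingle.
        if term.startswith(prefix) and term not in found:
            found.append(term)
    return found


# Hand-written sample in the shape of an _analyze response (illustrative only).
sample = {
    "tokens": [
        {"token": "breaking news", "start_offset": 16, "end_offset": 29,
         "type": "shingle", "position": 3},
        {"token": "news and", "start_offset": 25, "end_offset": 33,
         "type": "shingle", "position": 4},
        {"token": "and information", "start_offset": 30, "end_offset": 45,
         "type": "shingle", "position": 5},
    ]
}

print(shingles_after(sample, "news"))  # ['news and']
```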

Or look at the term vectors information:

    GET sh/_termvectors/1
    {
      "fields" : ["content.shingled"],
      "offsets" : true,
      "payloads" : true,
      "positions" : true,
      "term_statistics" : true,
      "field_statistics" : true
    }
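
The term vectors response lists every indexed shingle of the document along with its frequency, so it can also feed suggestions. The Python sketch below assumes the parsed response nests terms under `term_vectors` → field name → `terms`, each with a `term_freq`. The helper name `suggest_from_termvectors`, the `limit` parameter and the `sample` data (including its frequencies) are hypothetical.

```python
def suggest_from_termvectors(response, field, word, limit=5):
    """Return up to `limit` shingles starting with `word`, most frequent first."""
    terms = response.get("term_vectors", {}).get(field, {}).get("terms", {})
    prefix = word.lower() + " "
    matches = [
        (term, info.get("term_freq", 0))
        for term, info in terms.items()
        if term.startswith(prefix)
    ]
    # Sort by descending frequency, then alphabetically for a stable order.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in matches[:limit]]


# Hand-written sample in the shape of a _termvectors response (illustrative only).
sample = {
    "term_vectors": {
        "content.shingled": {
            "terms": {
                "is the": {"term_freq": 1},
                "the content": {"term_freq": 2},
                "the defined": {"term_freq": 1},
                "the elastic": {"term_freq": 1},
            }
        }
    }
}

print(suggest_from_termvectors(sample, "content.shingled", "the"))
# ['the content', 'the defined', 'the elastic']
```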

Are you going to use highlighting as well?
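
If highlighting is used, only the marked fragments need to be returned to the user. The Python sketch below assumes the search request asked for highlighting on the field and kept the default `<em>` and `</em>` tags, so that each hit carries a `highlight` object keyed by field name. The helper name `highlighted_snippets` and the `sample` response are hand-written for illustration; the exact fragments Elasticsearch returns for a shingled field may differ.

```python
import re


def highlighted_snippets(search_response, field):
    """Collect the text wrapped in <em>...</em> for `field` across all hits."""
    snippets = []
    for hit in search_response.get("hits", {}).get("hits", []):
        for fragment in hit.get("highlight", {}).get(field, []):
            # Non-greedy match so several highlights in one fragment stay separate.
            snippets.extend(re.findall(r"<em>(.*?)</em>", fragment))
    return snippets


# Hand-written sample in the shape of a search response (illustrative only).
sample = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "highlight": {
                    "content.shingled": [
                        "latest breaking <em>news and</em> information"
                    ]
                },
            }
        ]
    }
}

print(highlighted_snippets(sample, "content.shingled"))  # ['news and']
```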

Regarding "elasticsearch - Getting results from Elasticsearch", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62362837/
