elasticsearch - Adding a new document to Elasticsearch via elasticsearch.index: body structure and mappings

Reposted. Author: 行者123. Updated: 2023-12-02 23:12:46

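I am building a blog-like application with Flask (based on Miguel Grinberg's Mega-Tutorial) and am trying to set up an ES index that supports autocomplete. I am struggling to set the index up correctly.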

I started with a simple (working) indexing mechanism:

from flask import current_app

def add_to_index(index, model):
    if not current_app.elasticsearch:
        return
    payload = {}
    for field in model.__searchable__:
        payload[field] = getattr(model, field)
    current_app.elasticsearch.index(index=index, id=model.id, body=payload)

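After searching Google for a while, I found that my body could look something like this (probably with fewer analyzers, but I am copying it exactly as I found it somewhere; the author claims it works):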
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {},
        "analyzer": {
          "keyword_analyzer": {
            "filter": ["lowercase", "asciifolding", "trim"],
            "char_filter": [],
            "type": "custom",
            "tokenizer": "keyword"
          },
          "edge_ngram_analyzer": {
            "filter": ["lowercase"],
            "tokenizer": "edge_ngram_tokenizer"
          },
          "edge_ngram_search_analyzer": {
            "tokenizer": "lowercase"
          }
        },
        "tokenizer": {
          "edge_ngram_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 5,
            "token_chars": ["letter"]
          }
        }
      }
    }
  },
  "mappings": {
    field: {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "keywordstring": {
              "type": "text",
              "analyzer": "keyword_analyzer"
            },
            "edgengram": {
              "type": "text",
              "analyzer": "edge_ngram_analyzer",
              "search_analyzer": "edge_ngram_search_analyzer"
            },
            "completion": {
              "type": "completion"
            }
          },
          "analyzer": "standard"
        }
      }
    }
  }
}

I figured I could modify the original mechanism to:
    for field in model.__searchable__:
        temp = getattr(model, field)
        fields[field] = {"properties": {
            "type": "text",
            "fields": {
                "keywordstring": {
                    "type": "text",
                    "analyzer": "keyword_analyzer"
                },
                "edgengram": {
                    "type": "text",
                    "analyzer": "edge_ngram_analyzer",
                    "search_analyzer": "edge_ngram_search_analyzer"
                },
                "completion": {
                    "type": "completion"
                }
            },
            "analyzer": "standard"
        }}
    payload = {
        "settings": {
            "index": {
                "analysis": {
                    "filter": {},
                    "analyzer": {
                        "keyword_analyzer": {
                            "filter": ["lowercase", "asciifolding", "trim"],
                            "char_filter": [],
                            "type": "custom",
                            "tokenizer": "keyword"
                        },
                        "edge_ngram_analyzer": {
                            "filter": ["lowercase"],
                            "tokenizer": "edge_ngram_tokenizer"
                        },
                        "edge_ngram_search_analyzer": {
                            "tokenizer": "lowercase"
                        }
                    },
                    "tokenizer": {
                        "edge_ngram_tokenizer": {
                            "type": "edge_ngram",
                            "min_gram": 2,
                            "max_gram": 5,
                            "token_chars": ["letter"]
                        }
                    }
                }
            }
        },
        "mappings": fields
    }

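But this is where I get lost. Where in this document should I put the actual content (temp = getattr(model, field)) so that the whole thing works? I could not find any example or relevant part of the documentation that covers updating an index with a slightly more complex mapping. Is this even correct or feasible? Every guide I see covers bulk indexing, and somehow I fail to make the connection.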

Best answer

I think you are a bit confused; let me try to explain. You are adding a document with:

current_app.elasticsearch.index(index=index, id=model.id, body=payload)



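This uses the index() method defined in the elasticsearch-py library.
See the example here:
https://elasticsearch-py.readthedocs.io/en/master/index.html#example-usage
The body must be a simple dict containing your document, as shown in the documentation example.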
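
As a minimal sketch of what "a simple dict of your document" means here: the payload carries only field values, with nothing about analyzers or mappings. `Post` and `build_document` are hypothetical names used for illustration; `__searchable__` follows the convention from the question.

```python
class Post:
    """Hypothetical stand-in for the model used in the question."""

    # Names of the attributes that should end up in the search index.
    __searchable__ = ["name", "body"]

    def __init__(self, id, name, body):
        self.id = id
        self.name = name
        self.body = body


def build_document(model):
    """Return the document body: field name -> field value, nothing else."""
    return {field: getattr(model, field) for field in model.__searchable__}


post = Post(1, "Elasticsearch basics", "How to index a document")
doc = build_document(post)
print(doc)
```

A dict like `doc` is what would be passed as `body` to `index()`, together with `index=` and `id=`.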

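What you are setting instead is the settings of the index, which is a different thing. By analogy with a database, you are putting the schema of a table inside a document.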

To apply the settings (if you do want the settings you posted), you need to use put_settings, as defined here:
https://elasticsearch-py.readthedocs.io/en/master/api.html?highlight=settings#elasticsearch.client.ClusterClient.put_settings
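
One way to keep the two concerns apart is sketched below. It is not taken from the answer: instead of put_settings it supplies the settings and mappings when the index is created, via `indices.create`, because Elasticsearch only accepts new analyzers at creation time or on a closed index. The sketch assumes elasticsearch-py 7.x call signatures (the `body=` keyword, as in the question) and Elasticsearch 7 or later, where mappings have no type-name level. `build_index_body` and `ensure_index` are hypothetical helper names, `es` stands for a client such as `current_app.elasticsearch`, and the analysis block is trimmed from the question.

```python
import copy

# Analysis settings, trimmed from the body in the question.
ANALYSIS = {
    "analyzer": {
        "edge_ngram_analyzer": {
            "type": "custom",
            "tokenizer": "edge_ngram_tokenizer",
            "filter": ["lowercase"],
        },
        "edge_ngram_search_analyzer": {"type": "custom", "tokenizer": "lowercase"},
    },
    "tokenizer": {
        "edge_ngram_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 5,
            "token_chars": ["letter"],
        }
    },
}

# Mapping applied to every searchable field (assumption: all are text fields).
FIELD_MAPPING = {
    "type": "text",
    "analyzer": "standard",
    "fields": {
        "edgengram": {
            "type": "text",
            "analyzer": "edge_ngram_analyzer",
            "search_analyzer": "edge_ngram_search_analyzer",
        },
        "completion": {"type": "completion"},
    },
}


def build_index_body(searchable_fields):
    """Return the index-creation body: schema only, no document values."""
    return {
        "settings": {"analysis": copy.deepcopy(ANALYSIS)},
        "mappings": {
            "properties": {
                name: copy.deepcopy(FIELD_MAPPING) for name in searchable_fields
            }
        },
    }


def ensure_index(es, index, searchable_fields):
    """Create the index with its settings and mappings if it does not exist."""
    if not es.indices.exists(index=index):
        es.indices.create(index=index, body=build_index_body(searchable_fields))


def add_to_index(es, index, model):
    """Index one document; the body holds only the model's field values."""
    payload = {field: getattr(model, field) for field in model.__searchable__}
    es.index(index=index, id=model.id, body=payload)
```

With this split, `ensure_index` would run once (for example at application start-up), while `add_to_index` stays as simple as the original working version in the question.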

Hope this helps.

Regarding "elasticsearch - Adding a new document to Elasticsearch via elasticsearch.index: body structure and mappings", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58332727/
