
elasticsearch - Getting RequestError(400, 'search_phase_execution_exception', 'runtime error') when computing similarity

Reposted. Author: 行者123. Updated: 2023-12-05 03:42:20

I am trying to do semantic search with Elasticsearch using tensorflow_hub, but I get RequestError: RequestError(400, 'search_phase_execution_exception', 'runtime error'). From search_phase_execution_exception I assume the data is corrupted (based on this Stack Overflow question). My document structure looks like this:

{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  },
  "mappings": {
    "dynamic": "true",
    "_source": {
      "enabled": "true"
    },
    "properties": {
      "id": {
        "type": "keyword"
      },
      "title": {
        "type": "text"
      },
      "abstract": {
        "type": "text"
      },
      "abs_emb": {
        "type": "dense_vector",
        "dims": 512
      },
      "timestamp": {
        "type": "date"
      }
    }
  }
}

Then I created the index with elasticsearch.indices.create and indexed the documents.

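import datetime

# Drop any stale index first, then create it so the dense_vector mapping is in place
es.indices.delete(index=index, ignore=[404])
es.indices.create(index=index, body=my_document_structure)  # the mapping dict shown above

for i in range(100):
    doc = {
        'timestamp': datetime.datetime.utcnow(),
        'id': id[i],
        'title': title[0][i],
        'abstract': abstract[0][i],
        # embed a one-element batch and take its single vector
        'abs_emb': tf_hub_KerasLayer([abstract[0][i]])[0]
    }
    res = es.index(index=index, body=doc)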
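Since the suspicion here is that the document structure is wrong, one cheap check is to compare the mapping the index actually has with a document about to be indexed. The following is a minimal sketch, not part of the original post: `vector_field_problems` is a hypothetical helper, and it assumes the `properties` dict has the same shape as the `mappings.properties` section shown above.

```python
def vector_field_problems(properties, doc, field="abs_emb"):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    spec = properties.get(field)
    if spec is None:
        return [f"field {field!r} is missing from the mapping"]

    problems = []
    if spec.get("type") != "dense_vector":
        problems.append(
            f"field {field!r} is mapped as {spec.get('type')!r}, not 'dense_vector'"
        )

    dims = spec.get("dims")
    vector = doc.get(field)
    if vector is None:
        problems.append(f"document has no {field!r} value")
    elif dims is not None and len(vector) != dims:
        problems.append(
            f"document vector has {len(vector)} values, mapping expects {dims}"
        )
    return problems


# Illustrative data only: the mapping fragment from the question and a dummy vector.
expected_properties = {"abs_emb": {"type": "dense_vector", "dims": 512}}
good_doc = {"abs_emb": [0.0] * 512}
short_doc = {"abs_emb": [0.0] * 128}

print(vector_field_problems(expected_properties, good_doc))
print(vector_field_problems(expected_properties, short_doc))
```

Against a live cluster, the `properties` argument would presumably come from something like `es.indices.get_mapping(index=index)[index]["mappings"].get("properties", {})`. If `abs_emb` shows up there as anything other than `dense_vector`, the index was probably created without the intended mapping.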

I use this code for the semantic search:

query = "graphene"
query_vector = list(embed([query])[0])

script_query = {
    "script_score": {
        "query": {"match_all": {}},
        "script": {
            "source": "cosineSimilarity(params.query_vector, doc['abs_emb']) + 1.0",
            "params": {"query_vector": query_vector}
        }
    }
}

response = es.search(
    index=index,
    body={
        "size": 5,
        "query": script_query,
        "_source": {"includes": ["title", "abstract"]}
    }
)

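I know there are some similar questions on Stack Overflow and in the Elasticsearch forums, but I couldn't find a solution that works for me. My guess is that the document structure is wrong, but I can't figure out exactly what. I took the search query code from this repo. The full error message is too long and doesn't seem to contain much information, so I'm only sharing the last part.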

~/untitled/elastic/venv/lib/python3.9/site-packages/elasticsearch/connection/base.py in _raise_error(self, status_code, raw_data)
    320             logger.warning("Undecodable raw error response from server: %s", err)
    321
--> 322         raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
    323             status_code, error_message, additional_info
    324         )

RequestError: RequestError(400, 'search_phase_execution_exception', 'runtime error')
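The short message `'runtime error'` hides the underlying cause, which Elasticsearch usually nests inside the error body under `failed_shards[...].reason.caused_by`. The sketch below walks that chain. It assumes the elasticsearch-py 7.x client, where the exception's `info` attribute holds the parsed response body. `script_error_reasons` is a hypothetical helper, and `sample_info` is a made-up payload in the usual shape of such a response, not output from the original poster's cluster.

```python
def script_error_reasons(info):
    """Collect (type, reason) pairs for each failed shard, outermost cause first."""
    reasons = []
    error = info.get("error", {}) if isinstance(info, dict) else {}
    if not isinstance(error, dict):
        return reasons
    for shard in error.get("failed_shards", []):
        cause = shard.get("reason")
        # Follow the caused_by chain down to the innermost exception.
        while isinstance(cause, dict):
            reasons.append((cause.get("type"), cause.get("reason")))
            cause = cause.get("caused_by")
    return reasons


# Hypothetical error body for illustration; the inner message is invented.
sample_info = {
    "error": {
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
        "failed_shards": [
            {
                "shard": 0,
                "reason": {
                    "type": "script_exception",
                    "reason": "runtime error",
                    "caused_by": {
                        "type": "illegal_argument_exception",
                        "reason": "example inner message",
                    },
                },
            }
        ],
    },
    "status": 400,
}

for error_type, reason in script_error_reasons(sample_info):
    print(f"{error_type}: {reason}")
```

With a live client, this could be used as `except RequestError as e: print(script_error_reasons(e.info))` around the `es.search(...)` call, with `RequestError` imported from the `elasticsearch` package.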

This is the error from the Elasticsearch server.

[2021-04-29T12:43:07,797][WARN ][o.e.c.r.a.DiskThresholdMonitor] [asmac.local] high disk watermark [90%] exceeded on [w7lUacguTZWH9xc_lyd0kg][asmac.local][/Users/username/elasticsearch-7.12.0/data/nodes/0] free: 17.2gb[7.4%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete

Best answer

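I think you are hitting the following issue. You should update your query to pass the field name as a string instead of doc['abs_emb']: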

script_query = {
    "script_score": {
        "query": {"match_all": {}},
        "script": {
            "source": "cosineSimilarity(params.query_vector, 'abs_emb') + 1.0",
            "params": {"query_vector": query_vector}
        }
    }
}

Also make sure query_vector contains floats and not doubles.
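The answer does not say how to do that conversion. One possible reading is that every element should be a plain JSON number at single precision. The sketch below follows that reading using only the standard library. `to_float32_list` and `cosine_script_query` are hypothetical helper names, and the input values are placeholders rather than a real embedding.

```python
import json
import struct


def to_float32_list(vector):
    """Convert numeric scalars to plain Python floats rounded to 32-bit precision."""
    # Packing as a C float and unpacking again drops the extra double precision.
    return [struct.unpack("f", struct.pack("f", float(x)))[0] for x in vector]


def cosine_script_query(query_vector, field="abs_emb"):
    """Build the script_score query from the answer, passing the field by name."""
    return {
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                "params": {"query_vector": query_vector},
            },
        }
    }


# Placeholder values standing in for list(embed([query])[0]).
query_vector = to_float32_list([0.1, 0.25, 1])
script_query = cosine_script_query(query_vector)

# The request body must be JSON-serializable before the client can send it.
body = json.dumps({"size": 5, "query": script_query})
```

Calling `float()` on each element also turns scalar types such as NumPy floats into native Python floats, which the JSON encoder can handle.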

Regarding elasticsearch - Getting RequestError(400, 'search_phase_execution_exception', 'runtime error') when computing similarity, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/67313858/
