
java - MultiTermVectors in Elasticsearch Java

Reposted. Author: 行者123. Updated: 2023-12-01 21:26:43

I am using the following function to fetch the term vectors for a set of IDs.

public static void builtTermVectorRequest(Client client, String index, Map<String, String> postIDs) {
    TermVectorsRequest termVectorsRequest = new TermVectorsRequest();
    termVectorsRequest.index(index).type("post");
    for (Map.Entry<String, String> entry : postIDs.entrySet()) {
        String currentPostId = entry.getKey();
        String currentParentID = entry.getValue();
        termVectorsRequest
            .id(currentPostId)
            .parent(currentParentID)
            .termStatistics(true)
            .selectedFields("content");
    }

    MultiTermVectorsRequestBuilder mtbuilder = client.prepareMultiTermVectors();
    mtbuilder.add(termVectorsRequest);

    MultiTermVectorsResponse response = mtbuilder.execute().actionGet();
    XContentBuilder builder;
    try {
        builder = XContentFactory.jsonBuilder().startObject();
        response.toXContent(builder, ToXContent.EMPTY_PARAMS);
        builder.endObject();
        System.out.println(builder.prettyPrint().string());
    } catch (IOException e) {}
}

Here I have a set of document IDs together with their parent IDs, because these documents are child documents.

The response reports these documents as not found, even though they exist.

To confirm this, I tried the same thing in Python, using:

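# `result` is assumed to be the response of an earlier search request.
# A list comprehension is used instead of map() so the body is also
# JSON-serializable on Python 3, where map() returns an iterator.
body = dict(docs=[
    {
        "fields": ["content"],
        "_id": x["_id"],
        "_routing": x["_routing"],
        "term_statistics": "true"
    }
    for x in result["hits"]["hits"]
])

es_client = elasticsearch.Elasticsearch([{'host': '192.168.111.12', 'port': 9200}])

all_term_vectors = es_client.mtermvectors(
    index="prf_test",
    doc_type="post",
    body=body
)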

With this I do get results.

What is wrong with the Java code?

Best answer

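I tried several more combinations of TermVectorsRequest and MultiTermVectorsRequestBuilder and eventually arrived at the following working solution. It creates a separate TermVectorsRequest for each document ID and adds each one to the builder, rather than reusing a single request object across the loop: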

/**
 * Prints term vectors for child documents, given their parent IDs.
 *
 * @param client  Elasticsearch client
 * @param index   Index name
 * @param postIDs Map of child document ID to its _parent/_routing ID
 */
public static void builtTermVectorRequest(Client client, String index, Map<String, String> postIDs) {
    // Initialize the MultiTermVectorsRequestBuilder first
    MultiTermVectorsRequestBuilder multiTermVectorsRequestBuilder = client.prepareMultiTermVectors();

    // For every document ID, create a different TermVectorsRequest and
    // add it to the MultiTermVectorsRequestBuilder created above
    for (Map.Entry<String, String> entry : postIDs.entrySet()) {
        String currentPostId = entry.getKey();
        String currentRoutingID = entry.getValue();
        TermVectorsRequest termVectorsRequest = new TermVectorsRequest()
            .index(index)
            .type("doc_type")
            .id(currentPostId)
            .parent(currentRoutingID) // You can use .routing(currentRoutingID) also
            .selectedFields("some_field")
            .termStatistics(true);
        multiTermVectorsRequestBuilder.add(termVectorsRequest);
    }

    // Finally execute the MultiTermVectorsRequestBuilder
    MultiTermVectorsResponse response = multiTermVectorsRequestBuilder.execute().actionGet();

    XContentBuilder builder;
    try {
        builder = XContentFactory.jsonBuilder().startObject();
        response.toXContent(builder, ToXContent.EMPTY_PARAMS);
        builder.endObject();
        System.out.println(builder.prettyPrint().string());
    } catch (IOException e) {
        // Report the failure instead of silently swallowing it
        System.err.println("Failed to render the term vectors response: " + e.getMessage());
    }
}
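
The structural difference between the question's loop and the answer's loop can be seen without a running cluster. The sketch below is only an illustration: FakeTermVectorsRequest is a hypothetical stand-in with fluent setters, not the Elasticsearch TermVectorsRequest class, and it assumes the real setters overwrite earlier values the way ordinary fluent setters do. Whether this alone accounts for the "not found" results in the question is not confirmed by the source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a mutable request with fluent setters.
// This is NOT the Elasticsearch TermVectorsRequest class.
class FakeTermVectorsRequest {
    private String id;
    private String parent;

    FakeTermVectorsRequest id(String id) { this.id = id; return this; }
    FakeTermVectorsRequest parent(String parent) { this.parent = parent; return this; }

    @Override
    public String toString() { return id + "@" + parent; }
}

class TermVectorBatchDemo {
    // Pattern from the question: one request is mutated in the loop
    // and added to the batch a single time after the loop.
    static List<String> sharedRequest(Map<String, String> postIds) {
        List<FakeTermVectorsRequest> batch = new ArrayList<>();
        FakeTermVectorsRequest request = new FakeTermVectorsRequest();
        for (Map.Entry<String, String> entry : postIds.entrySet()) {
            // Each call overwrites the id and parent set in the previous iteration.
            request.id(entry.getKey()).parent(entry.getValue());
        }
        batch.add(request);
        return describe(batch);
    }

    // Pattern from the answer: a new request per document,
    // each added to the batch inside the loop.
    static List<String> requestPerDocument(Map<String, String> postIds) {
        List<FakeTermVectorsRequest> batch = new ArrayList<>();
        for (Map.Entry<String, String> entry : postIds.entrySet()) {
            batch.add(new FakeTermVectorsRequest()
                .id(entry.getKey())
                .parent(entry.getValue()));
        }
        return describe(batch);
    }

    // Renders every request in the batch as "id@parent".
    private static List<String> describe(List<FakeTermVectorsRequest> batch) {
        List<String> out = new ArrayList<>();
        for (FakeTermVectorsRequest request : batch) {
            out.add(request.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is deterministic.
        Map<String, String> postIds = new LinkedHashMap<>();
        postIds.put("c1", "p1");
        postIds.put("c2", "p2");
        System.out.println(sharedRequest(postIds));      // prints [c2@p2]
        System.out.println(requestPerDocument(postIds)); // prints [c1@p1, c2@p2]
    }
}
```

With the shared request, only the last ID and parent pair reaches the batch, whereas one request per document keeps every pair. This matches the change the answer makes.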

Regarding java - MultiTermVectors in Elasticsearch Java, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/38033666/
