
java - Fetching all records from Elasticsearch using the Java API

Reposted. Author: 搜寻专家. Updated: 2023-11-01 02:04:37

I am trying to fetch all records from Elasticsearch using the Java API, but I get the following error:

n[[Wild Thing][localhost:9300][indices:data/read/search[phase/dfs]]]; nested: QueryPhaseExecutionException[Result window is too large, from + size must be less than or equal to: [10000] but was [10101].

My code is as follows:

    Client client;
    try {
        client = TransportClient.builder().build()
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));
        int from = 1;
        int to = 100;
        while (from <= 131881) {
            SearchResponse response = client
                    .prepareSearch("demo_risk_data")
                    .setSearchType(SearchType.DFS_QUERY_THEN_FETCH).setFrom(from)
                    .setQuery(QueryBuilders.boolQuery().mustNot(QueryBuilders.termQuery("user_agent", "")))
                    .setSize(to).setExplain(true).execute().actionGet();
            if (response.getHits().getHits().length > 0) {
                for (SearchHit searchData : response.getHits().getHits()) {
                    JSONObject value = new JSONObject(searchData.getSource());
                    System.out.println(value.toString());
                }
            }
        }
    }

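There are currently 131881 records in total, so I start with from = 1 and to = 100 and fetch 100 records at a time while from <= 131881. Is there a way to keep fetching records in batches of, say, 100 until there are no more records left in Elasticsearch?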

Best answer

Yes, you can use the scroll API, which the Java client also supports.
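
For context, the error in the question comes from plain from/size pagination: the message states that from + size must be less than or equal to 10000, which is the default of the index.max_result_window setting. The sketch below only mirrors that arithmetic; ResultWindow and fits are hypothetical names used for illustration and are not part of the Elasticsearch client.

```java
// Illustrative only: mirrors the arithmetic in the error message,
// not the actual Elasticsearch implementation.
final class ResultWindow {
    // Limit reported in the error message (default index.max_result_window).
    static final int MAX_RESULT_WINDOW = 10_000;

    // True if a from/size page can be served with plain pagination.
    static boolean fits(int from, int size) {
        return from + size <= MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println(fits(9_900, 100));   // prints true  (9900 + 100 = 10000)
        System.out.println(fits(10_001, 100));  // prints false (10001 + 100 = 10101)
    }
}
```

Any request that pages past this limit is rejected, which is why deep iteration over an index of 131881 records needs a different mechanism such as scrolling.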

You can do it like this with the scroll API:

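    import java.net.InetAddress;
    import java.net.UnknownHostException;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.client.transport.TransportClient;
    import org.elasticsearch.common.transport.InetSocketTransportAddress;
    import org.elasticsearch.common.unit.TimeValue;
    import org.elasticsearch.index.query.QueryBuilder;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.search.SearchHit;
    import org.elasticsearch.search.sort.SortOrder;
    import org.elasticsearch.search.sort.SortParseElement;
    import org.json.JSONObject;

    Client client;
    try {
        client = TransportClient.builder().build()
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));

        QueryBuilder qb = QueryBuilders.boolQuery().mustNot(QueryBuilders.termQuery("user_agent", ""));
        // Open a scroll context that is kept alive for 60 seconds between requests
        SearchResponse scrollResp = client.prepareSearch("demo_risk_data")
                .addSort(SortParseElement.DOC_FIELD_NAME, SortOrder.ASC)
                .setScroll(new TimeValue(60000))
                .setQuery(qb)
                .setSize(100).execute().actionGet();

        // Scroll until no hits are returned
        while (true) {
            // Break condition: no hits are returned
            if (scrollResp.getHits().getHits().length == 0) {
                break;
            }

            // Otherwise read the results of the current batch
            for (SearchHit hit : scrollResp.getHits().getHits()) {
                JSONObject value = new JSONObject(hit.getSource());
                System.out.println(value.toString());
            }

            // Prepare the next query using the scroll id of the previous response
            scrollResp = client.prepareSearchScroll(scrollResp.getScrollId()).setScroll(new TimeValue(60000)).execute().actionGet();
        }
    } catch (UnknownHostException e) {
        // InetAddress.getByName can fail to resolve the host
        e.printStackTrace();
    }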
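
The control flow of the scroll loop (fetch a batch, stop when the batch is empty, otherwise process it and request the next one) can be tried without a cluster. The following is a minimal sketch under stated assumptions: BatchCursor is a hypothetical in-memory stand-in for a scroll context, and ScrollLoopSketch.drain is an illustrative helper; neither is part of the Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a scrollable result set; not part of Elasticsearch.
final class BatchCursor {
    private final List<String> records;
    private final int batchSize;
    private int position = 0;

    BatchCursor(List<String> records, int batchSize) {
        this.records = records;
        this.batchSize = batchSize;
    }

    // Returns the next batch, or an empty list once everything has been read.
    List<String> nextBatch() {
        int end = Math.min(position + batchSize, records.size());
        List<String> batch = new ArrayList<>(records.subList(position, end));
        position = end;
        return batch;
    }
}

final class ScrollLoopSketch {
    // Drains the cursor batch by batch and returns how many records were seen.
    static int drain(BatchCursor cursor) {
        int seen = 0;
        List<String> batch = cursor.nextBatch();
        while (!batch.isEmpty()) {       // same break condition as the scroll loop
            seen += batch.size();        // process the current batch here
            batch = cursor.nextBatch();  // request the next batch
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            records.add("record-" + i);
        }
        System.out.println(drain(new BatchCursor(records, 100))); // prints 250
    }
}
```

The caller never needs to know the total record count up front; the empty batch is the only termination signal, which is the property the question asks for.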

Regarding java - fetching all records from Elasticsearch using the Java API, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37669046/
