
java - Elasticsearch - Java RestHighLevelClient - How to get all documents using the scroll API

Reposted. Author: 行者123. Updated: 2023-12-02 22:43:14

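My Elasticsearch index holds about 30,000 entities. I want to fetch all of their IDs using RestHighLevelClient. I have read that the best way to do this is the scroll API. However, when I try it, I receive only about 10 entities instead of 30k. How can I fix this?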

    final class ElasticRepo {
        private final RestHighLevelClient restHighLevelClient;

        List<ListingsData> getAllListingsDataIds() {
            val request = new SearchRequest(ELASTICSEARCH_LISTINGS_INDEX);
            request.types(ELASTICSEARCH_TYPE);
            val searchSourceBuilder = new SearchSourceBuilder()
                    .query(matchAllQuery())
                    .fetchSource(new String[]{"listing_id"}, new String[]{"backoffice_data", "search_and_match_data"});
            request.source(searchSourceBuilder);
            request.scroll(TimeValue.timeValueMinutes(3));
            return executeQuery(request);
        }

        private List<ListingsData> executeQuery(final SearchRequest searchQuery) {
            try {
                val hits = restHighLevelClient.search(searchQuery, RequestOptions.DEFAULT).getHits().getHits();
                return Arrays.stream(hits).map(SearchHit::getSourceAsString).map(ElasticRepo::toListingsData).collect(Collectors.toList());
            } catch (IOException e) {
                e.printStackTrace();
                throw new RuntimeException("");
            }
        }

    }

When I do this, executeQuery returns only about 11 entities. How can I fix this and fetch all the documents in the index?

Best answer

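Try following this example. The first search call returns only the first page of hits (10 by default when no size is set); the remaining pages have to be requested repeatedly with the returned scroll ID until a page comes back empty. I am using this code and it works: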

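    import org.elasticsearch.action.search.ClearScrollRequest;
    import org.elasticsearch.action.search.ClearScrollResponse;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.action.search.SearchScrollRequest;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.common.unit.TimeValue; // package may differ by client version
    import org.elasticsearch.index.query.QueryBuilder;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.index.query.QueryStringQueryBuilder;
    import org.elasticsearch.search.Scroll;
    import org.elasticsearch.search.SearchHit;
    import org.elasticsearch.search.builder.SearchSourceBuilder;

    String query = "your query here";
    QueryBuilder matchQueryBuilder = QueryBuilders.boolQuery().must(new QueryStringQueryBuilder(query));

    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(matchQueryBuilder);
    searchSourceBuilder.size(5000); // hits per scroll page; default max is 10000

    // searchRequest was not declared in the original snippet
    SearchRequest searchRequest = new SearchRequest();
    searchRequest.indices("your index here");
    searchRequest.source(searchSourceBuilder);

    // Keep the scroll context alive for 10 minutes between requests
    final Scroll scroll = new Scroll(TimeValue.timeValueMinutes(10L));
    searchRequest.scroll(scroll);

    // client is the RestHighLevelClient; some older client versions take no RequestOptions argument
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    String scrollId = searchResponse.getScrollId();
    SearchHit[] allHits = new SearchHit[0];
    SearchHit[] searchHits = searchResponse.getHits().getHits();

    // Fetch page after page until a page comes back empty
    while (searchHits != null && searchHits.length > 0) {
        // Helper.concatenate: write a function that concatenates two arrays
        allHits = Helper.concatenate(allHits, searchHits);

        SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
        scrollRequest.scroll(scroll);
        // Named searchScroll in older client versions
        searchResponse = client.scroll(scrollRequest, RequestOptions.DEFAULT);
        scrollId = searchResponse.getScrollId();
        searchHits = searchResponse.getHits().getHits();
    }

    // Release the scroll context on the server once done
    ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
    clearScrollRequest.addScrollId(scrollId);
    ClearScrollResponse clearScrollResponse = client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);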
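
The loop above calls Helper.concatenate but leaves writing it to the reader. A minimal sketch of such a helper could look like the following. The class name Helper and the method name come from the answer; the body is an assumption, not the answerer's actual code, and it uses only the Java standard library.

```java
import java.util.Arrays;

// Hypothetical helper: the answer only names Helper.concatenate,
// so this implementation is an assumption.
final class Helper {
    private Helper() {}

    /** Returns a new array with the elements of first followed by the elements of second. */
    static <T> T[] concatenate(T[] first, T[] second) {
        // copyOf keeps the runtime element type of first and pads the tail with nulls
        T[] result = Arrays.copyOf(first, first.length + second.length);
        // overwrite the padded tail with the elements of second
        System.arraycopy(second, 0, result, first.length, second.length);
        return result;
    }
}
```

With this in place, each Helper.concatenate call in the loop appends the current scroll page to the hits gathered so far. Every call copies the whole accumulated array, so for very large result sets it may be cheaper to collect the pages into a List<SearchHit> instead.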

For more on java - Elasticsearch - Java RestHighLevelClient - how to get all documents using the scroll API, see the similar question on Stack Overflow: https://stackoverflow.com/questions/53539363/
