Scroll API misses some documents

Problem description

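I am trying to use the Scroll API to fetch all documents from multiple indices, but it does not return all of them. I found a similar question, but the OP there was simply missing the first batch of documents. Link to that question: Elasticsearch Search Scroll API doesn't retrieve all the documents from an index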

Here is my code:

// Code to get indexes

for (String indexName : indexNames) {
   // Keep each scroll context alive for 45 seconds between batches
   final Scroll scroll = new Scroll(TimeValue.timeValueSeconds(45L));
   SearchRequest searchRequest = new SearchRequest(indexName);
   searchRequest.scroll(scroll);
   SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

   // Field names and date bounds are written as string literals here (assumed);
   // an unquoted 01-05-2021 would be evaluated by Java as integer arithmetic
   QueryBuilder query = QueryBuilders.boolQuery()
      .filter(QueryBuilders.termQuery("sourceId", 2))
      .filter(QueryBuilders.rangeQuery("date").gte("01-05-2021").lte("31-05-2021"));
   searchSourceBuilder.query(query);
   searchSourceBuilder.size(10000);
   searchRequest.source(searchSourceBuilder);

   // The initial search returns the first batch together with the scroll id
   SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
   String scrollId = searchResponse.getScrollId();
   SearchHit[] searchHits = searchResponse.getHits().getHits();

   List<Model> model = new ArrayList<>();

   while (searchHits != null && searchHits.length > 0) {
      for (SearchHit document : searchHits) {
         // add document to model list created above
      } // end of for loop

      // insert model list to database
      model.clear(); // start the next batch empty so earlier batches are not inserted again

      // Fetch the next batch with the most recent scroll id
      SearchScrollRequest searchScrollRequest = new SearchScrollRequest(scrollId);
      searchScrollRequest.scroll(scroll);
      searchResponse = client.scroll(searchScrollRequest, RequestOptions.DEFAULT);
      scrollId = searchResponse.getScrollId();
      searchHits = searchResponse.getHits().getHits();
   } // end of while loop

   // Release the scroll context; the request has to be sent to take effect
   ClearScrollRequest clear = new ClearScrollRequest();
   clear.addScrollId(scrollId);
   client.clearScroll(clear, RequestOptions.DEFAULT);

} // end of for loop at the top

I should get 115 million documents in total, but more than 2 million are missing. I have checked my code repeatedly and cannot see what I am missing.

Solution

No working solution to this problem has been found yet.
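To narrow down where the roughly 2 million documents go missing, it may help to reconcile, for each index, the total reported by the first response against the number of hits actually consumed in the loop. The class below is only a minimal sketch of that bookkeeping, not a confirmed fix: `ScrollAudit` and its methods are hypothetical names, and the index names and counts in `main` are made-up sample values. In the real loop, `expect` would presumably be fed from `searchResponse.getHits().getTotalHits()` (depending on the client version, an exact figure may require `searchSourceBuilder.trackTotalHits(true)`), and `addBatch` from `searchHits.length`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: tracks, per index, how many hits were reported versus consumed. */
public class ScrollAudit {
    private final Map<String, Long> expected = new LinkedHashMap<>();
    private final Map<String, Long> consumed = new LinkedHashMap<>();

    /** Record the total hit count reported by the first response for an index. */
    public void expect(String indexName, long totalHits) {
        expected.put(indexName, totalHits);
        consumed.putIfAbsent(indexName, 0L);
    }

    /** Record one consumed batch, for example searchHits.length. */
    public void addBatch(String indexName, int batchSize) {
        consumed.merge(indexName, (long) batchSize, Long::sum);
    }

    /** Hits reported but not consumed; 0 means the index was drained completely. */
    public long shortfall(String indexName) {
        return expected.getOrDefault(indexName, 0L) - consumed.getOrDefault(indexName, 0L);
    }

    /** Indices whose consumed count differs from the reported total, with the difference. */
    public Map<String, Long> incompleteIndices() {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String index : expected.keySet()) {
            long missing = shortfall(index);
            if (missing != 0) {
                result.put(index, missing);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ScrollAudit audit = new ScrollAudit();

        // Sample index that is drained completely: 25,000 reported, 25,000 consumed
        audit.expect("logs-a", 25_000L);
        audit.addBatch("logs-a", 10_000);
        audit.addBatch("logs-a", 10_000);
        audit.addBatch("logs-a", 5_000);

        // Sample index where the loop stopped early: 12,000 reported, 10,000 consumed
        audit.expect("logs-b", 12_000L);
        audit.addBatch("logs-b", 10_000);

        System.out.println(audit.incompleteIndices()); // prints {logs-b=2000}
    }
}
```

If every index reconciles and the grand total is still short, the gap more likely lies outside the scroll loop, for example in the query filters or in the database insert step. If specific indices come up short, checking `searchResponse.getFailedShards()` on each response for those indices would be a reasonable next step.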
