Batch-processing a paged data stream from a Spring JPA repository

Problem description

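I have a database with more than 3 million entries. Once I start processing these entries, new entries may be added to the database. I can produce pages of data with the SimpleJpaRepository method findAll, which takes a Specification and a PageRequest of size 1000.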

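Right now I stop after every 1000 entries and return a list of 1000 orders, but I want to keep going to the next 1000 entries so that the whole result set is exposed as one continuous stream of data.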

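Is it possible to process up to 10000 entries of a Java stream at any given point in time, then discard those 10000 from memory and wait until the stream fills up to the 10000 threshold again?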

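So although the database calls are batched to fetch 1000 order entities at a time, I want the results to flow into my Java code and be processed at most 10000 rows at a time.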

Think of 1000 as the size of one batch fetched from the database, and 10000 as the maximum size of the batch loaded into memory as a Java stream.
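The database-side batching described above could look roughly like the following. This is only a minimal sketch under stated assumptions: `fetchPage` is a hypothetical stand-in for a call such as `postgresBigOrderRepository.findAll(orderSpecification, PageRequest.of(pageIndex, 1000)).getContent()`, and the class and method names are made up for illustration, not part of Spring Data. It needs Java 9 or later for `takeWhile`. Note that offset-based paging can skip or repeat rows if new rows land before the current offset while paging is in progress, so a fixed sort order (for example by an ascending id) is assumed.

```java
import java.util.List;
import java.util.function.IntFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class PagedStreams {

    private PagedStreams() {}

    /**
     * Lazily streams the elements of page 0, 1, 2, ... until an empty page
     * is returned. fetchPage is a hypothetical hook for the repository call;
     * it is invoked only when the stream actually needs the next page.
     */
    public static <T> Stream<T> streamAllPages(IntFunction<List<T>> fetchPage) {
        return IntStream.iterate(0, pageIndex -> pageIndex + 1)
                .mapToObj(fetchPage)                 // one fetch per page index
                .takeWhile(page -> !page.isEmpty())  // an empty page ends the stream
                .flatMap(List::stream);              // flatten pages into single rows
    }

    public static void main(String[] args) {
        // 2,500 fake rows stand in for the order table (illustration only).
        List<Integer> fakeRows = IntStream.range(0, 2_500).boxed().collect(Collectors.toList());
        int pageSize = 1_000;
        IntFunction<List<Integer>> fetchPage = pageIndex -> {
            int from = Math.min(pageIndex * pageSize, fakeRows.size());
            int to = Math.min(from + pageSize, fakeRows.size());
            return fakeRows.subList(from, to);
        };
        System.out.println(streamAllPages(fetchPage).count()); // prints 2500
    }
}
```

With this shape, the consumer sees a single stream while only one page of 1000 rows is fetched at a time; a page that has been fully consumed is no longer referenced by the stream.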

Feel free to edit the question.

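  import static org.hamcrest.MatcherAssert.assertThat;
  import static org.hamcrest.Matchers.is;

  import java.math.BigDecimal;
  import java.util.List;
  import java.util.stream.Collectors;
  import org.junit.jupiter.api.Test; // assuming JUnit 5
  import org.springframework.data.domain.Page;
  import org.springframework.data.domain.PageRequest;
  import org.springframework.data.domain.Pageable;
  import org.springframework.data.jpa.domain.Specification;

  @Test
  void shouldReturnPageOfData_whenAskingForOrderTotalsLessthan_400() {
    Specification<OrderEntity> orderSpecification = OrderSpecification
        .orderTotalLessthan(BigDecimal.valueOf(400));
    Pageable paging = PageRequest.of(0, 1000);
    Page<OrderEntity> orderEntities = postgresBigOrderRepository
        .findAll(orderSpecification, paging);
    // I want orderEntities.stream() to be filled with up to 10 batches of my
    // DB data on the fly as the lines below execute.
    List<BigDecimal> orderTotals = orderEntities.stream()
        .map(orderEntity -> orderEntity.getOrderTotal())
        .collect(Collectors.toList());

    // Even if the PageRequest has 1000 entries, is it possible for orderTotals
    // to hold 1500 in case the next page of DB data has 500 entries?

    System.out.println(orderTotals.get(0));
    System.out.println(orderTotals.get(orderTotals.size() - 1));
    assertThat(orderTotals.size(), is(1000));
  }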
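For the in-memory side of the question (handle up to 10000 rows, drop them, then continue), a minimal sketch is shown here. It assumes the rows arrive as an ordinary `java.util.stream.Stream`; the class name `ChunkedProcessing`, the method `forEachChunk`, and the handler are hypothetical names chosen for illustration, and fake integers stand in for `OrderEntity` rows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class ChunkedProcessing {

    private ChunkedProcessing() {}

    /**
     * Pulls elements from the source one at a time and passes them to the
     * handler in lists of at most chunkSize elements. Only the current chunk
     * is referenced here, so earlier chunks can be garbage collected once the
     * handler returns, provided the handler does not keep them.
     */
    public static <T> void forEachChunk(Stream<T> source, int chunkSize,
                                        Consumer<? super List<T>> handler) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        Iterator<T> rows = source.iterator();
        List<T> chunk = new ArrayList<>(chunkSize);
        while (rows.hasNext()) {
            chunk.add(rows.next());
            if (chunk.size() == chunkSize) {
                handler.accept(chunk);
                chunk = new ArrayList<>(chunkSize); // start a fresh chunk
            }
        }
        if (!chunk.isEmpty()) {
            handler.accept(chunk); // final, possibly smaller chunk
        }
    }

    public static void main(String[] args) {
        // 25,000 fake rows stand in for the order stream (illustration only).
        Stream<Integer> fakeOrders = IntStream.range(0, 25_000).boxed();
        forEachChunk(fakeOrders, 10_000,
                chunk -> System.out.println("processing " + chunk.size() + " rows"));
    }
}
```

Run as written, the handler is called three times, with 10000, 10000, and 5000 rows. The last call also illustrates the 1500-row case raised in the comment above: a chunk simply holds whatever rows remain, so its size need not be a multiple of the page size.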


bigdata java java-stream stream