Spring Data JPA batching executes every insert as a separate statement

Problem

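I need to insert a lot of data into MySQL (about 100k rows), so I am trying to use batch inserts with Spring Data JPA. For now I am testing with a simple example that contains 30 records.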

The first thing I did was remove @GeneratedValue and make my entity implement Persistable, so that no SELECT is needed before each insert:

@Entity
public class User implements Persistable<Integer> {

    @Id
    private Integer id;
    // properties here...
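
Some background on why this avoids the extra SELECT: for an entity with a manually assigned id, Spring Data's save() chooses between persisting and merging depending on whether the entity reports itself as new, and the merge path generally has to load the row first. The following is a simplified plain-Java model of that decision, not Spring's actual code. ToyRepository, PersistableLike and SketchUser are hypothetical stand-ins, and the counters only illustrate which path is taken.

```java
import java.util.HashMap;
import java.util.Map;

// Toy repository that counts the lookups and inserts each save() needs.
final class ToyRepository {
    private final Map<Integer, SketchUser> table = new HashMap<>();
    int selects = 0;
    int inserts = 0;

    void save(SketchUser user) {
        if (user.isNew()) {
            // "persist" path: insert directly, no lookup first
            table.put(user.getId(), user);
            inserts++;
            user.markNotNew();
        } else {
            // "merge" path: look the row up before deciding what to write
            selects++;
            if (!table.containsKey(user.getId())) {
                inserts++;
            }
            table.put(user.getId(), user);
        }
    }

    public static void main(String[] args) {
        ToyRepository repo = new ToyRepository();
        for (int i = 1; i <= 30; i++) {
            repo.save(new SketchUser(i));
        }
        System.out.println("selects=" + repo.selects + ", inserts=" + repo.inserts);
    }
}

// Local stand-in with the same two methods as Spring Data's Persistable.
interface PersistableLike<ID> {
    ID getId();

    boolean isNew();
}

// Hypothetical entity with a manually assigned id and its own "new" flag.
final class SketchUser implements PersistableLike<Integer> {
    private final Integer id;
    private boolean fresh = true; // a real entity would mark this @Transient

    SketchUser(Integer id) {
        this.id = id;
    }

    @Override
    public Integer getId() {
        return id;
    }

    @Override
    public boolean isNew() {
        return fresh;
    }

    // A real entity would typically flip this in @PostPersist/@PostLoad callbacks.
    void markNotNew() {
        fresh = false;
    }
}
```

In this model, saving 30 fresh entities takes the insert path every time and never performs a lookup. Saving an entity that no longer reports itself as new goes through the lookup path instead.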

Then, in my application.yml:

spring:
  jpa:
    properties:
      hibernate.jdbc.batch_size: 30
      hibernate.generate_statistics: true
    show-sql: true
    hibernate:
      ddl-auto: validate
  datasource:
    driverClassName: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://localhost:3306/db?cachePrepStmts=true&reWriteBatchedInserts=true
    # user and password

I have a simple repository:

public interface UserRepository extends JpaRepository<User,Integer> { }

And the insert method:

public void process() {

    List<User> users = new ArrayList<>();

    for (int i = 1; i <= 30; i++) {
        User user = new User();
        user.setId(i);
        // set properties

        users.add(user);

        if (i % 30 == 0) {
            userRepository.saveAll(users);
            users.clear();
        }
    }
}
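
One thing to watch when scaling this loop from 30 records to around 100k: the i % 30 == 0 check only saves full chunks, so any records left over at the end (100,000 is not a multiple of 30) would stay in the list unless they are saved after the loop. A minimal plain-Java sketch of chunking that also handles the last partial chunk is shown here. BatchChunker and forEachChunk are hypothetical names, not part of Spring Data, and the callback stands in for userRepository::saveAll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: splits a list into fixed-size chunks and hands
// each chunk, including a smaller final one, to a callback.
final class BatchChunker {

    private BatchChunker() {
    }

    // Returns the number of chunks passed to the callback.
    static <T> int forEachChunk(List<T> items, int chunkSize, Consumer<List<T>> sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            // Copy the sub-list so the callback gets an independent list.
            sink.accept(new ArrayList<>(items.subList(from, to)));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            ids.add(i);
        }
        // In the service from the question, the callback would be userRepository::saveAll.
        int chunks = forEachChunk(ids, 30, chunk -> System.out.println("saving " + chunk.size()));
        System.out.println(chunks + " chunks");
    }
}
```

Keeping the chunk size equal to hibernate.jdbc.batch_size, as the question does with 30, is a reasonable default, though the two values do not have to match.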

I expected this to result in a single batch operation, but the statistics show 29 executed statements:

    1745893 nanoseconds spent acquiring 1 JDBC connections;
    0 nanoseconds spent releasing 0 JDBC connections;
    3524622 nanoseconds spent preparing 30 JDBC statements;
    68290171 nanoseconds spent executing 29 JDBC statements;
    215125391 nanoseconds spent executing 1 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    240389888 nanoseconds spent executing 1 flushes (flushing a total of 29 entities and 29 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)

Any ideas?

Thanks!

Solution

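Try the following property, which tells Hibernate to order pending inserts by entity so that statements for the same table can be grouped into a JDBC batch: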

spring.jpa.properties.hibernate.order_inserts=true
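
Since the question configures Hibernate through application.yml, the same setting can be written there. This is a sketch that assumes the layout from the question; only the hibernate.order_inserts line is new.

```yaml
spring:
  jpa:
    properties:
      hibernate.jdbc.batch_size: 30
      hibernate.order_inserts: true
      hibernate.generate_statistics: true
```

A separate point that may be worth checking, and is not part of the original answer: reWriteBatchedInserts, as used in the question's JDBC URL, is the PostgreSQL driver's option name. MySQL Connector/J calls its equivalent rewriteBatchedStatements.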