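I am trying to read an Excel file with 700K+ records and batch insert them into a MySQL database table.
Note that the Excel parsing is fast; I can get the entity objects into an ArrayList within about 50 seconds.
I am using Spring Boot and Spring Data JPA.
Below is part of my application.properties file: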
hibernate.jdbc.batch_size=1000
spring.jpa.hibernate.use-new-id-generator-mappings=true
And part of my entity class:
@Entity
@Table(name = "WHT_APPS", schema = "TEST")
public class WHTApps {
    @Id
    @TableGenerator(name = "whtAppsGen", table = "ID_GEN", pkColumnName = "GEN_KEY", valueColumnName = "GEN_VAL")
    @GeneratedValue(strategy = GenerationType.TABLE, generator = "whtAppsGen")
    private Long id;
    @Column(name = "VENDOR_CODE")
    private int vendorCode;
.
.
.
.
Below is my DAO:
@Repository
@Transactional
public class JapanWHTDaoImpl implements JapanWHTDao {
    @Autowired
    JapanWHTAppsRepository appsRepo;

    @Override
    public void storeApps(List<WHTApps> whtAppsList) {
        appsRepo.save(whtAppsList);
    }
}
Below is the Repository class:
@Transactional
public interface JapanWHTAppsRepository extends JpaRepository<WHTApps, Long> {
}
Can someone tell me what I am doing wrong here?
Edit:
The process does not finish and eventually throws an error:
2017-08-15 15:15:24.516 WARN 14710 --- [tp1413491716-17] o.h.engine.jdbc.spi.SqlExceptionHelper : SQL Error: 0, SQLState: 08S01
2017-08-15 15:15:24.516 ERROR 14710 --- [tp1413491716-17] o.h.engine.jdbc.spi.SqlExceptionHelper : Communications link failure
The last packet successfully received from the server was 107,472 milliseconds ago. The last packet sent successfully to the server was 107,472 milliseconds ago.
2017-08-15 15:15:24.518 INFO 14710 --- [tp1413491716-17] o.h.e.j.b.internal.AbstractBatchImpl : HHH000010: On release of batch it still contained JDBC statements
2017-08-15 15:15:24.525 WARN 14710 --- [tp1413491716-17] c.m.v.c3p0.impl.DefaultConnectionTester : SQL State '08007' of Exception tested by statusOnException() implies that the database is invalid, and the pool should refill itself with fresh Connections.
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_131]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_131]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_131]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_131]
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425) ~[mysql-connector-java-5.1.43.jar:5.1.43]
.
.
.
.
2017-08-15 15:15:24.526 WARN 14710 --- [tp1413491716-17] c.m.v2.c3p0.impl.NewPooledConnection : [c3p0] A PooledConnection that has already signalled a Connection error is still in use!
2017-08-15 15:15:24.527 WARN 14710 --- [tp1413491716-17] c.m.v2.c3p0.impl.NewPooledConnection : [c3p0] Another error has occurred [ com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown. ] which will not be reported to listeners!
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during rollback(). Transaction resolution unknown.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_131]
Thanks
Best answer
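I'd also like to point out one more thing. The problem may lie not only with Hibernate but also with the database.
When you insert 700k objects in a single transaction, the pending changes may be held in the database's rollback segment until the transaction commits.
If possible, split the logic so that commits happen along the way.
Create sublists of 1k items from the main list, save each sublist, and commit after each one.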
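The splitting step could be sketched roughly as follows. This is only an illustration under assumptions: ChunkedSaver and saveInChunks are hypothetical names that do not appear in the question, and the saveChunk callback stands in for whatever actually persists and commits one sublist. The helper only does the splitting; it does not manage transactions itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of the original question or answer.
final class ChunkedSaver {

    private ChunkedSaver() {}

    /**
     * Splits items into consecutive sublists of at most chunkSize elements
     * and hands each sublist to saveChunk, in order.
     * Returns the number of sublists that were handed over.
     */
    static <T> int saveInChunks(List<T> items, int chunkSize, Consumer<List<T>> saveChunk) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            // Copy the subList view so the callback receives an independent list.
            saveChunk.accept(new ArrayList<>(items.subList(from, to)));
            chunks++;
        }
        return chunks;
    }
}
```

With the code from the question, a call might look like ChunkedSaver.saveInChunks(whtAppsList, 1000, chunk -> appsRepo.save(chunk)). For each sublist to be committed separately, that call has to run outside a surrounding transaction. With the class-level @Transactional on the DAO as posted, storeApps would most likely still commit only once at the end, so the annotation would need to be removed from that method or the per-chunk save moved into its own transactional method.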
Regarding "java - Spring Data JPA batch insert is very slow", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/45687799/