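This article explains how to export millions of rows of data to Excel in Java without exhausting memory.
Exporting reports is a common requirement in business systems, and as the business grows the database may come to hold millions or even tens of millions of rows. The simplest way to implement an export is to load all the required data from the database into memory in one go, write it to an Excel file, and return the file to the user. That works while the data set is small, but once the export reaches several hundred thousand or a million rows it is likely to cause an OutOfMemoryError (OOM) that brings the service down, and the export time grows sharply as well.
The approach described here supports exporting a million rows while using very little memory. The core idea is never to load all the data into memory at once.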
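The problem is tackled from two sides:
1. Do not load the data from the database in one go. Load it in batches, either by paging or with a cursor, and process each batch and release its memory before loading the next, so that memory usage stays fairly flat. Paging is tedious to code, so I usually load row by row with a cursor. The commonly used persistence frameworks are JPA, MyBatis, and Hibernate; the examples below show cursor-based loading with JPA (including QueryDSL), MyBatis, and Hibernate.
2. Write to Excel in batches as well. Alibaba's EasyExcel is recommended because its memory footprint is very low.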
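To make the cursor idea in point 1 concrete, here is a minimal plain-JDBC sketch of what the framework examples later in this article do underneath. It assumes MySQL Connector/J, which streams rows one at a time when the statement is forward-only, read-only, and has a fetch size of Integer.MIN_VALUE. The StreamingQuery class and its RowCallback interface are hypothetical names introduced for illustration; they are not from the original code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper for illustration; not part of the original article's code. */
final class StreamingQuery {
    /** MySQL Connector/J interprets this fetch size as "stream rows one at a time". */
    static final int STREAMING_FETCH_SIZE = Integer.MIN_VALUE;

    /** Invoked once per row; it should copy what it needs and not keep the ResultSet. */
    interface RowCallback {
        void onRow(ResultSet row) throws SQLException;
    }

    /** Runs the query, passes each row to the callback, and returns the number of rows seen. */
    static long forEachRow(Connection conn, String sql, RowCallback callback) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(
                sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            stmt.setFetchSize(STREAMING_FETCH_SIZE);
            long count = 0;
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    callback.onRow(rs); // only the current row is held in memory
                    count++;
                }
            }
            return count;
        }
    }
}
```

Because only the current row is materialized, memory use does not grow with the size of the result. In an export, the callback would typically map the row to a small object and hand it to the Excel writer.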
The Maven dependency for EasyExcel:
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>easyexcel</artifactId>
    <version>2.1.1</version>
    <optional>true</optional>
</dependency>
Environment: JDK 1.8, IntelliJ IDEA 2019, heap settings -Xms256M -Xmx256M (exporting one million rows is no problem with this), Spring Boot, and MySQL as the database.
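A quick way to confirm that the heap limit really took effect (the maximum heap is controlled by `-Xmx`) is to ask the JVM itself. The sketch below uses only the standard Runtime API; the HeapInfo class name is a hypothetical one chosen for illustration.

```java
/** Hypothetical helper for checking the heap limits of the running JVM. */
final class HeapInfo {
    private static final long MIB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently occupied by objects (including garbage not yet collected), in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB):  " + maxHeapMiB());
        System.out.println("used heap (MiB): " + usedHeapMiB());
    }
}
```

With a 256 MB limit the reported maximum should be close to 256; depending on the garbage collector it may be slightly lower. Logging usedHeapMiB() periodically during an export is a cheap way to check that memory stays flat.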
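First, the result. With the maximum heap set to 256 MB and a query joining two tables, exporting one million rows kept heap usage fairly flat and took 143 seconds (roughly 7,000 rows per second). There is still room to speed this up, and a single-table export would be faster.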
JPA approach
pom.xml:
<!-- Spring Web dependency, needed to build a web project -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- JPA -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <optional>true</optional>
</dependency>
repository:
import static org.hibernate.jpa.QueryHints.HINT_FETCH_SIZE;

@Repository
public interface UserRepository extends JpaRepository<UserEntity, Integer> {

    // A fetch size of Integer.MIN_VALUE tells MySQL to return rows one at a time.
    // The result must be received as a Stream.
    @QueryHints(@QueryHint(name = HINT_FETCH_SIZE, value = Integer.MIN_VALUE + ""))
    @Query(value = "select * from user limit 500000", nativeQuery = true)
    Stream<UserEntity> findAllList();
}
service:
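Notes:
The method needs a transaction annotation, and the transaction must be read-only.
Call entityManager.detach() promptly to release memory; otherwise an OOM will still occur.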
@Autowired
private EntityManager entityManager;

@Autowired
private UserRepository userRepository;

@Transactional(readOnly = true)
public void exportData3(ScrollResultsHandler<UserExportVO> scrollResultsHandler) {
    // try-with-resources closes the stream so the underlying result set is released.
    try (Stream<UserEntity> allList = userRepository.findAllList()) {
        allList.forEach(userEntity -> {
            UserExportVO userExportVO = UserExportVO.builder()
                    .userName(userEntity.getUsername())
                    .mobile(userEntity.getMobile())
                    .build();
            scrollResultsHandler.handle(userExportVO);
            // The entity is held by the session; detach it to release memory.
            entityManager.detach(userEntity);
        });
    }
}
controller:
@RequestMapping("export4")
public void export4(HttpServletResponse response) throws IOException {
    response.setContentType("application/vnd.ms-excel");
    response.setCharacterEncoding("utf-8");
    String filenames = "bigdata4";
    response.addHeader("Content-Disposition", "attachment; filename=" + filenames + ".xlsx");
    ExcelWriter excelWriter = EasyExcel.write(response.getOutputStream(), UserExportVO.class).build();
    WriteSheet writeSheet = EasyExcel.writerSheet(0, "sheet").build();
    try {
        // Each row handed over by the service is written to the sheet straight away.
        userService.exportData3(row -> {
            List<UserExportVO> rows = new ArrayList<>();
            rows.add(row);
            excelWriter.write(rows, writeSheet);
        });
    } finally {
        // Always finish the writer so the workbook is flushed and resources are released.
        excelWriter.finish();
    }
}
Supporting classes:
/**
 * @author 奔騰的野馬
 * @date 2022/04/25 09:12
 */
public interface ScrollResultsHandler<T> {
    void handle(T t);
}

import com.alibaba.excel.annotation.ExcelProperty;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

/**
 * @Author: 奔騰的野馬
 * @Date: 2021/10/16 16:19
 */
@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
public class UserExportVO {
    // Column header: user name
    @ExcelProperty(value = "用戶名")
    private String userName;

    // Column header: mobile number
    @ExcelProperty(value = "手機號")
    private String mobile;
}
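The controller above writes one row per call to the Excel writer. Since the second point of the approach is to write in batches, one possible refinement is to buffer rows and write them a batch at a time. The sketch below is an assumption about how that could look, not the author's code: BatchingHandler is a hypothetical class, the batch size is arbitrary, and the ScrollResultsHandler interface is repeated only so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Same shape as the ScrollResultsHandler shown above; repeated so this sketch is self-contained. */
interface ScrollResultsHandler<T> {
    void handle(T t);
}

/** Hypothetical helper: collects rows and passes them on in fixed-size batches. */
final class BatchingHandler<T> implements ScrollResultsHandler<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private List<T> buffer;

    BatchingHandler(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.buffer = new ArrayList<>(batchSize);
    }

    @Override
    public void handle(T row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Call once after the last row so a final partial batch is not lost. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = buffer;
        buffer = new ArrayList<>(batchSize); // new list, so the sink may keep the old one
        sink.accept(batch);
    }
}
```

In the controller, such a handler could be created as `new BatchingHandler<UserExportVO>(1000, rows -> excelWriter.write(rows, sheet))`, where `sheet` is the WriteSheet built there, passed to the service method in place of the per-row lambda, and flushed once before `excelWriter.finish()`. Memory stays bounded by the batch size rather than by the size of the result.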
QueryDSL approach
pom.xml:
<dependencies>
    <!-- Spring Web dependency, needed to build a web project -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
    <!-- QueryDSL support -->
    <dependency>
        <groupId>com.querydsl</groupId>
        <artifactId>querydsl-apt</artifactId>
        <version>5.0.0</version>
        <scope>provided</scope>
    </dependency>
    <!-- QueryDSL support -->
    <dependency>
        <groupId>com.querydsl</groupId>
        <artifactId>querydsl-jpa</artifactId>
        <version>5.0.0</version>
    </dependency>
    <dependency>
        <groupId>com.querydsl</groupId>
        <artifactId>querydsl-core</artifactId>
        <version>5.0.0</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <!-- QueryDSL plugin -->
        <plugin>
            <groupId>com.mysema.maven</groupId>
            <artifactId>apt-maven-plugin</artifactId>
            <version>1.1.3</version>
            <executions>
                <execution>
                    <goals>
                        <goal>process</goal>
                    </goals>
                    <configuration>
                        <outputDirectory>target/generated-sources/java</outputDirectory>
                        <processor>com.querydsl.apt.jpa.JPAAnnotationProcessor</processor>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
service:
@Autowired
private JPAQueryFactory jpaQueryFactory;

private final QUserEntity qUserEntity = QUserEntity.userEntity;

@Transactional(readOnly = true)
public void exportData2(ScrollResultsHandler<UserExportVO> scrollResultsHandler) {
    // Receive the result as a Stream so that rows can be handled one at a time;
    // try-with-resources closes the underlying result set afterwards.
    try (Stream<UserExportVO> userExportVOStream = jpaQueryFactory
            .select(Projections.bean(UserExportVO.class, qUserEntity.userName, qUserEntity.mobile))
            .from(qUserEntity)
            //.join(xxxEntity)
            //.on(xxxx)
            // A fetch size of Integer.MIN_VALUE tells MySQL to return rows one at a time.
            // It only takes effect with exactly this value.
            .setHint(HINT_FETCH_SIZE, Integer.MIN_VALUE + "")
            .limit(1000000)
            .stream()) {
        userExportVOStream.forEach(scrollResultsHandler::handle);
    }
}
controller:
Same as the controller in the JPA approach, except that the callback is passed to userService.exportData2.
MyBatis approach
pom.xml:
<dependency>
    <groupId>org.mybatis</groupId>
    <artifactId>mybatis</artifactId>
    <version>3.5.9</version>
</dependency>
dao:
/**
 * @author 奔騰的野馬
 * @date 2022/04/16 19:14
 */
@Mapper
public interface UserDao {

    // ResultSetType.FORWARD_ONLY: the cursor can only move forward.
    // fetchSize must be Integer.MIN_VALUE for the cursor to take effect.
    @Options(resultSetType = ResultSetType.FORWARD_ONLY, fetchSize = Integer.MIN_VALUE)
    @ResultType(UserExportVO.class)
    @Select("select userName,mobile from user limit 500000")
    void reportAll2(ResultHandler<UserExportVO> handler);
}
service:
@Transactional(readOnly = true)
public void export2(ResultHandler<UserExportVO> handler) {
    userDao.reportAll2(handler);
}
controller:
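Same as above, except that it calls userService.export2. The MyBatis ResultHandler receives a ResultContext, so the row is obtained with getResultObject().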
Hibernate approach
service:
@Autowired
private EntityManager entityManager;

public void exportData(ScrollResultsHandler<UserExportVO> scrollResultsHandler) {
    // When no caching is needed, a StatelessSession is the better choice.
    StatelessSession session = ((Session) entityManager.getDelegate())
            .getSessionFactory().openStatelessSession();
    try {
        Query query = session.getNamedQuery("getAllList");
        query.setCacheMode(CacheMode.IGNORE);
        // setFetchSize(Integer.MIN_VALUE) tells MySQL to return rows one at a time.
        query.setFetchSize(Integer.MIN_VALUE);
        query.setFirstResult(0);
        query.setMaxResults(1000000);
        query.setReadOnly(true);
        query.setLockMode("a", LockMode.NONE);
        // ScrollMode.FORWARD_ONLY: the cursor can only move forward.
        ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
        try {
            while (results.next()) {
                UserEntity userEntity = (UserEntity) results.get(0);
                UserExportVO userExportVO = UserExportVO.builder()
                        .userName(userEntity.getUsername())
                        .mobile(userEntity.getMobile())
                        .build();
                scrollResultsHandler.handle(userExportVO);
            }
        } finally {
            results.close();
        }
    } finally {
        session.close();
    }
}
controller:
Same as above.
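1. OOM still occurred during export. The causes found were:
1.1 The project used log4jdbc-log4j2-jdbc4.1 (version 1.16) with the driver net.sf.log4jdbc.sql.jdbcapi.DriverSpy. Switching to the native MySQL driver fixed it. log4jdbc-log4j2-jdbc4.1 is meant to make it easy to print SQL during development, yet it ended up causing the OOM, so third-party jars should be adopted with care.
1.2 The project contained several versions of QueryDSL, which caused a jar conflict. Resolving the conflict fixed the problem.
1.3 When a second query was issued, Hibernate's first-level cache was not released in time. Further analysis showed that a large number of objects were cached in org.hibernate.engine.StatefulPersistenceContext, so the first-level cache was leaking.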
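Solution:
Hibernate's first-level cache is used internally and cannot be switched off or disabled; it lives until the Session is destroyed. According to the Hibernate documentation, it can be cleared in the following ways:
1) To evict a single object:
Session session = sessionFactory.getCurrentSession();
session.evict(entity);
2) To clear all cached entities:
Session session = sessionFactory.getCurrentSession();
session.clear();
It is advisable to clear the Hibernate first-level cache in your code so that the memory it holds is released promptly.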
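When many rows pass through a regular Session, one common pattern is to clear the cache every N rows rather than after every row. The counter below is a minimal sketch of that pattern under stated assumptions: EveryN is a hypothetical helper, and the Runnable stands in for a call such as `session::clear`; the interval is a tuning choice, not a value from the original article.

```java
/** Hypothetical helper: runs an action once every `interval` calls to tick(). */
final class EveryN {
    private final int interval;
    private final Runnable action;
    private long count;

    EveryN(int interval, Runnable action) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
        this.action = action;
    }

    /** Call once per processed row; triggers the action on every interval-th call. */
    void tick() {
        count++;
        if (count % interval == 0) {
            action.run();
        }
    }

    /** Number of rows seen so far. */
    long count() {
        return count;
    }
}
```

Inside the export loop this would be created once, for example as `new EveryN(1000, session::clear)`, with `tick()` called after each row has been handed to the writer. Note that clearing the session detaches every managed entity, which is fine for a read-only export but would discard unflushed changes in a read-write one.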
2. While iterating over the stream during export, issuing a second database query fails with the message "Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@5800daf5 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries"
Full error:
java.sql.SQLException: Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@3b8732ec is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
    at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129)
    at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
    at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
    at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
    at com.mysql.cj.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:1003)
    at net.sf.log4jdbc.sql.jdbcapi.PreparedStatementSpy.executeQuery(PreparedStatementSpy.java:780)
    at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52)
    at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java)
    at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:57)
    at org.hibernate.loader.Loader.getResultSet(Loader.java:2292)
    at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:2050)
    at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:2012)
    at org.hibernate.loader.Loader.doQuery(Loader.java:953)
    at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:354)
    at org.hibernate.loader.Loader.doList(Loader.java:2838)
    at org.hibernate.loader.Loader.doList(Loader.java:2820)
    at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2652)
    at org.hibernate.loader.Loader.list(Loader.java:2647)
    at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:506)
    at org.hibernate.hql.internal.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:396)
    at org.hibernate.engine.query.spi.HQLQueryPlan.performList(HQLQueryPlan.java:219)
    at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1404)
    at org.hibernate.query.internal.AbstractProducedQuery.doList(AbstractProducedQuery.java:1562)
    at org.hibernate.query.internal.AbstractProducedQuery.list(AbstractProducedQuery.java:1530)
    at org.hibernate.query.internal.AbstractProducedQuery.getSingleResult(AbstractProducedQuery.java:1578)
    at org.hibernate.query.criteria.internal.compile.CriteriaQueryTypeQueryAdapter.getSingleResult(CriteriaQueryTypeQueryAdapter.java:111)
    at org.springframework.data.jpa.repository.query.JpaQueryExecution$SingleEntityExecution.doExecute(JpaQueryExecution.java:196)
    at org.springframework.data.jpa.repository.query.JpaQueryExecution.execute(JpaQueryExecution.java:88)
    at org.springframework.data.jpa.repository.query.AbstractJpaQuery.doExecute(AbstractJpaQuery.java:154)
    at org.springframework.data.jpa.repository.query.AbstractJpaQuery.execute(AbstractJpaQuery.java:142)
    at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:618)
    at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:605)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.data.projection.DefaultMethodInvokingMethodInterceptor.invoke(DefaultMethodInvokingMethodInterceptor.java:80)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:366)
    at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:99)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor.invoke(CrudMethodMetadataPostProcessor.java:149)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
    at com.sun.proxy.$Proxy223.countAllByDeliveryNo(Unknown Source)
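After some research it turned out that MySQL does not allow another query to be issued on the same connection while a streaming query is still in progress.
Solutions:
Method 1: Run the second query asynchronously, so that it is not issued on the same connection.
Method 2: Run the second query in a new transaction. In Spring this can be done with a transaction annotation on the method that performs the second query. The method has to be invoked through the Spring proxy (for example, on another bean) for the new transaction to start. The annotation is: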
@Transactional(propagation = Propagation.REQUIRES_NEW,readOnly = true)
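Method 1 above can be sketched with the standard concurrency utilities. In Spring, a transaction and its connection are bound to the current thread, so a lookup executed on another thread obtains a different connection from the pool. The class below is a hypothetical illustration, not the author's implementation: the `lookup` function stands in for a repository call such as the countAllByDeliveryNo seen in the stack trace, and a single worker thread is an arbitrary choice.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Hypothetical helper: runs a lookup on a worker thread and waits for its result. */
final class OffThreadLookup<K, V> implements AutoCloseable {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Function<K, V> lookup;

    OffThreadLookup(Function<K, V> lookup) {
        this.lookup = lookup;
    }

    /** Executes the lookup on the worker thread and blocks until it completes. */
    V get(K key) {
        return CompletableFuture.supplyAsync(() -> lookup.apply(key), executor).join();
    }

    /** Stops the worker thread once the export is finished. */
    @Override
    public void close() {
        executor.shutdown();
    }
}
```

The streaming connection stays open while each lookup runs, so the connection pool needs at least two connections available, and a lookup per row will slow the export noticeably. Where possible, fetching the extra data in the main query with a join avoids the second query altogether.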