
java - Streaming large files from a postgres database to the file system using JDBC

Reprinted · Author: 行者123 · Updated: 2023-11-29 13:39:48

I store files in my postgres database in a column of type bytea, and their size can exceed the allocated Java heap space, so when trying to write these files to the file system I quickly run into memory problems.

I am using JDBC to execute the query and then extracting the content as a binary stream.

Here is a simplified version of my code:

    public File readContent(String contentId) throws SQLException, IOException {
        try (Connection connection = jdbcTemplate.getDataSource().getConnection();
             PreparedStatement statement = connection.prepareStatement(
                     "SELECT content FROM table.entry WHERE id = ?")) {
            statement.setString(1, contentId);
            try (ResultSet resultSet = statement.executeQuery()) {
                resultSet.next();
                return writeToFileSystem(resultSet.getBinaryStream(1));
            }
        }
    }

    private File writeToFileSystem(InputStream inputStream) throws IOException {
        File dir = createDirectories(Paths.get(properties.getTempFolder(),
                UUID.randomUUID().toString())).toFile();
        File file = new File(dir, "content.zip");
        FileUtils.copyInputStreamToFile(inputStream, file);
        return file;
    }

My expectation was that this would let me stream the data from the database into the file without loading it fully into memory. But this approach does not work: I still get OutOfMemoryErrors as soon as the query is executed:

Caused by: java.lang.OutOfMemoryError: Java heap space
at org.postgresql.core.PGStream.receiveTupleV3(PGStream.java:395)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2118)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:288)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:430)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:356)
at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:168)
at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:116)
at sun.reflect.GeneratedMethodAccessor201.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.tomcat.jdbc.pool.StatementFacade$StatementProxy.invoke(StatementFacade.java:114)
at com.sun.proxy.$Proxy149.executeQuery(Unknown Source)
at [...].ContentRepository.readContent(ContentRepository.java:111)

Is there any way to stream the data from the database into a file without increasing the memory available to the Java virtual machine?

Best Answer

According to this mail group discussion, you should not use bytea for this use case:

There are two methods to store binary data in pg and they have different access methods and performance characteristics. Bytea data is expected to be shorter and is returned in whole with a ResultSet by the server. For larger data you want to use large objects which return a pointer (oid) to the actual data which you can then stream from the server at will.

This page describes some of the differences between the two and demonstrates using a pg specific api to access large objects, but getBlob/setBlob will work just fine.
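Following the getBlob route from the quote above: if the column is changed from bytea to oid, pgjdbc hands back a Blob whose stream reads from the server incrementally instead of materializing the value. A minimal sketch, assuming the pgjdbc driver and the same hypothetical table.entry layout as the question, with content now an oid column (large objects are only accessible inside a transaction, hence autocommit is switched off):

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Blob;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BlobStreamExample {

    // Streams an oid-backed large object to a file without buffering it on the heap.
    static Path readContent(Connection connection, String contentId, Path target) throws Exception {
        connection.setAutoCommit(false); // large objects must be read inside a transaction
        try (PreparedStatement statement = connection.prepareStatement(
                "SELECT content FROM table.entry WHERE id = ?")) { // content is an oid column here
            statement.setString(1, contentId);
            try (ResultSet resultSet = statement.executeQuery()) {
                resultSet.next();
                Blob blob = resultSet.getBlob(1); // pgjdbc backs this Blob by the large object, not a byte[]
                try (InputStream in = blob.getBinaryStream()) {
                    Files.copy(in, target); // copies chunk by chunk
                } finally {
                    blob.free();
                }
            }
            connection.commit();
        }
        return target;
    }
}
```

Because only a small buffer is in flight at any time, the file size is no longer bounded by the heap.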

See Chapter 7. Storing Binary Data, which shows example code, and Chapter 35. Large Objects, which explains in detail:

PostgreSQL has a large object facility, which provides stream-style access to user data that is stored in a special large-object structure. Streaming access is useful when working with data values that are too large to manipulate conveniently as a whole.
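The pg-specific api mentioned above exposes the same stream-style access more explicitly. A sketch assuming pgjdbc's org.postgresql.largeobject classes and the same hypothetical oid-typed content column:

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import org.postgresql.PGConnection;
import org.postgresql.largeobject.LargeObject;
import org.postgresql.largeobject.LargeObjectManager;

public class LargeObjectStreamExample {

    static void readContent(Connection connection, String contentId, Path target) throws Exception {
        connection.setAutoCommit(false); // required: large objects live inside a transaction
        LargeObjectManager manager = connection.unwrap(PGConnection.class).getLargeObjectAPI();
        try (PreparedStatement statement = connection.prepareStatement(
                "SELECT content FROM table.entry WHERE id = ?")) {
            statement.setString(1, contentId);
            try (ResultSet resultSet = statement.executeQuery()) {
                resultSet.next();
                long oid = resultSet.getLong(1); // the row only holds a pointer to the data
                LargeObject largeObject = manager.open(oid, LargeObjectManager.READ);
                try (InputStream in = largeObject.getInputStream()) {
                    Files.copy(in, target); // streams from the server at will
                } finally {
                    largeObject.close();
                }
            }
            connection.commit();
        }
    }
}
```

Either variant avoids the receiveTupleV3 allocation in the stack trace, since the query result no longer contains the payload itself.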

Regarding "java - Streaming large files from a postgres database to the file system using JDBC", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57092307/
