gpt4 book ai didi

java - Presto JDBC 大型查询极慢

转载 作者:行者123 更新时间:2023-12-02 09:07:54 37 4
gpt4 key购买 nike

问题

  1. 使用 JDBC 驱动程序从 Presto 查询迭代大型 ResultSet 非常耗时。每个批处理调用似乎几乎需要 60 秒,这很可疑(可能是由于超时造成的?)。
  2. 由于某种原因,初始的executeQuery方法几​​乎花费了45秒,这是可疑的。

查询

我正在使用 Java JDBC Presto 驱动程序对 Presto 执行一个简单的查询,其本质上如下所示:

SELECT stringA, stringB 
FROM {table}
LIMIT 500000

stringAstringB 很小 - 每个大约 10 个字符。
当使用 teradata 驱动程序在 DbVisualizer 中运行时,我的查询在 10 秒内完成。
但是,当我使用 0.230 presto-jdbc 驱动程序从 Spring Java 应用程序运行相同的查询时,似乎会批量返回结果(大约 75,000 个),并且每个批处理需要一分钟多的时间才能返回。

我已经阅读了一些有关 Presto 的 targetResultSize 查询参数的内容,但我无法使用 JDBC 驱动程序/连接来设置它。我读过 presto 默认情况下一次只会返回 1MB 的数据?不确定这是否是我上述问题 #1 的原因 - 如果能弄清楚如何配置它,那就太好了。

Java 代码

public List<Object> getResultSetUsingDriverManager(ChronoLocalDate chronoLocalDate) throws SQLException {
long start = System.currentTimeMillis();

Properties properties = new Properties();
properties.setProperty("user", USERNAME);
properties.setProperty("password", PASSWORD);
properties.setProperty("SSL", "true");

final Connection connection = DriverManager.getConnection(URL, properties);

log.warn("Presto connection acquired in " + (System.currentTimeMillis() - start) + "ms");

Statement statement = connection.createStatement();

ResultSet resultSet = statement.executeQuery(query);

log.warn("Presto query executed in " + (System.currentTimeMillis() - start) + "ms");

List<Object> collection = new ArrayList<>();

int counter = 0;
long batchStart = System.currentTimeMillis();
while (resultSet.next()) {
counter++;
if (counter % 1000 == 0) {
log.warn("current count {} and took {}ms", counter, (System.currentTimeMillis() - batchStart));
batchStart = System.currentTimeMillis();
}
}
log.warn("Results extracted in " + (System.currentTimeMillis() - start));

return collection;
}

输出

2020-01-08 17:34:31.704  WARN 29368 --- ...       : Presto connection acquired in 0ms
2020-01-08 17:35:16.705 WARN 29368 --- ... : Presto query executed in 45003ms
2020-01-08 17:37:18.242 WARN 29368 --- ... : current count 1000 and took 121537ms
2020-01-08 17:37:18.244 WARN 29368 --- ... : current count 2000 and took 2ms
2020-01-08 17:37:18.245 WARN 29368 --- ... : current count 3000 and took 1ms
...
2020-01-08 17:37:18.294 WARN 29368 --- ... : current count 75000 and took 1ms
2020-01-08 17:38:18.857 WARN 29368 --- ... : current count 76000 and took 60563ms
2020-01-08 17:38:18.858 WARN 29368 --- ... : current count 77000 and took 1ms
...
2020-01-08 17:38:18.941 WARN 29368 --- ... : current count 151000 and took 0ms
2020-01-08 17:39:19.241 WARN 29368 --- ... : current count 152000 and took 60300ms
2020-01-08 17:39:19.242 WARN 29368 --- ... : current count 153000 and took 1ms
...
2020-01-08 17:39:19.311 WARN 29368 --- ... : current count 250000 and took 0ms
2020-01-08 17:39:19.311 WARN 29368 --- ... : Results extracted in 287609

版本信息

  • Java 11
  • com.facebook.presto presto-jdbc 0.230
  • Spring Boot 2.1.6.RELEASE
  • Presto 版本:302-e.3(星爆版本)

最佳答案

近几个月来,我们修复了与 Java 11 上的客户端连接相关的问题。

请将您的 JDBC 驱动程序升级到 327。

或者在客户端降级到 Java 8。

关于java - Presto JDBC 大型查询极慢,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/59656252/

37 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com