gpt4 book ai didi

mysql - MariaDB JDBC 驱动程序与 SQL Server 相比无法有效地批量更新

转载 作者:行者123 更新时间:2023-11-29 17:33:25 25 4
gpt4 key购买 nike

我对我的应用程序中插入/更新/删除例程的性能进行了基准测试,我正在将其从 SQL Server 移植到 MariaDB。

  • 本地 Win10 工作站上的 Java 1.8,配备 i7 2.80GHz CPU + 16GB RAM
  • JDBC org.mariadb.jdbc:mariadb-java-client:2.2.4
  • 10.2.12-MariaDB-log AWS 上的 MariaDB 服务器

基准测试触发了 50,000 次插入、相同的更新和删除。

SQL Server 通过 net.sourceforge.jtds JDBC 驱动程序在 1 秒内处理所有这些。

使用 MariaDB-java-client 驱动程序的 MariaDB 插入速度更快,但更新(和删除)速度慢得多,需要 3.5 秒。

两个数据库中的架构相同,我认为由于 MariaDB 中的插入速度很快,这可能排除了索引问题或服务器配置错误。

我尝试了 JDBC 连接字符串的多种变体,最终发现这是最快的:

  ?verifyServerCertificate=true\
&useSSL=true\
&requireSSL=true\
&allowMultiQueries=true\
&cachePrepStmts=true\
&cacheResultSetMetadata=true\
&cacheServerConfiguration=true\
&elideSetAutoCommits=true\
&maintainTimeStats=false\
&prepStmtCacheSize=50000\
&prepStmtCacheSqlLimit=204800\
&rewriteBatchedStatements=false\
&useBatchMultiSend=true\
&useBatchMultiSendNumber=50000\
&useBulkStmts=true\
&useLocalSessionState=true\
&useLocalTransactionState=true\
&useServerPrepStmts=true

在所有情况下,mysql 和 mysql-connectorj 的性能都比 mariadb 差。

我已经研究这个问题一周了,正在考虑采用我之前的问题 How do I increase the speed of a large series of UPDATEs in mySQL vs SQL Server? 中建议的解决方法。

以防万一,这可能是服务器配置错误,以下是我得到的关键变量:

key_buffer_size                16MB
innodb_buffer_pool_size 24GB (mem 30GB)
innodb_log_file_size 134MB
innodb_log_buffer_size 8MB
innodb_flush_log_at_trx_commit 0
max_allowed_packet 16MB

我的 50,000 次写入只是很小的数据量 - 大约 2MB。但对于 SQL 语法,当它通过 JDBC 连接时,这可能要大 10 倍 - 这是正确的吗?

这是 SQL 和解释计划:

Describe `data`
+---------------+------------------+------+-----+---------------------+-------------------------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------------------+-------------------------------+
| parentId | int(10) unsigned | NO | PRI | NULL | |
| modifiedDate | date | NO | PRI | NULL | |
| valueDate | date | NO | PRI | NULL | |
| value | float | NO | | NULL | |
| versionstamp | int(10) unsigned | NO | | 1 | |
| createdDate | datetime | YES | | current_timestamp() | |
| last_modified | datetime | YES | | NULL | on update current_timestamp() |
+---------------+------------------+------+-----+---------------------+-------------------------------+

INSERT INTO `data` (`value`, `parentId`, `modifiedDate`, `valueDate`) VALUES (4853.16314229298,52054,'20-Apr-18','28-Dec-18')

+------+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+-------+
| 1 | INSERT | data | ALL | NULL | NULL | NULL | NULL | NULL | NULL |
+------+-------------+-------+------+---------------+------+---------+------+------+-------+



UPDATE `data` SET `value` = 4853.16314229298 WHERE `parentId` = 52054 AND `modifiedDate` = '20-Apr-18' AND `valueDate` = '28-Dec-18'

+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | data | range | PRIMARY | PRIMARY | 10 | NULL | 1 | Using where |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+


DELETE FROM `data` WHERE `parentId` = 52054 AND `modifiedDate` = '20-Apr-18' AND `valueDate` = '29-Jan-16'

+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | data | range | PRIMARY | PRIMARY | 10 | NULL | 1 | Using where |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

[更新]

JDBC 用法 - 这是一个精简版本,因此请原谅任何严重错误:

    final Connection connection = dataSource.getConnection();
connection.setAutoCommit(false);
try (PreparedStatement statement = connection.prepareStatement(
"UPDATE data SET value = ? " +
"WHERE parentId = ? " +
"AND modifiedDate = ? " +
"AND valueDate = ? ")) {
// timeSeries is a list of 50,000 data points
Arrays.stream(timeSeries)
.forEach(ts -> {
try {
statement.setDouble(1, value);
statement.setLong(2, parentId);
statement.setDate(3, new java.sql.Date(
modifiedDate.getTime()));
statement.setDate(4, new java.sql.Date(
valueDate.getTime()));
statement.addBatch();
} catch (SQLException e) {
throw new RuntimeException(
"Bad batch statement handling", e);
}
});
int[] results = statement.executeBatch();
connection.commit();
} catch (SQLException e) {
connection.rollback();
throw e;
} finally {
connection.close();
}

我还从 General_log 中获取了一些数据,显示传入的 JDBC 调用,它看起来非常基本 - 用于设置语句的“准备”调用,然后进行单独更新。

这让我感到惊讶 - 似乎没有批处理:

13/06/2018 15:09    service_user_t[service_user_t] @  [9.177.2.31]  75954   298206495   Query   set autocommit=0
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Prepare UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09 service_user_t[service_user_t] @ [9.177.2.31] 75954 298206495 Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
etc
etc

最佳答案

在批处理中的某些行之间添加“开始”和“提交”语句。或者在批处理之前启动事务,然后提交。这比数千个单独的声明要快得多。

如果您只执行插入操作,则 rewriteBatchStatements=true 应该可以显着加快速度,而无需进行事务。另外,您还可以将 max_packet_size 增加到 1GB,这将执行更多批处理,也许您的整个批处理将转换为 1 个非常大的多插入。

关于mysql - MariaDB JDBC 驱动程序与 SQL Server 相比无法有效地批量更新,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/50473450/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com