mariadb - Table size: MariaDB ColumnStore vs. InnoDB

Reposted by: 行者123. Updated: 2023-12-05 09:17:11

Every analysis of MariaDB ColumnStore that I have found claims it uses less disk space than a conventional engine such as InnoDB, for example: https://www.percona.com/blog/2017/03/17/column-store-database-benchmarks-mariadb-columnstore-vs-clickhouse-vs-apache-spark/

But that is not what I found in my own test:

CREATE TABLE `innodb_test` (id int, value1 bigint, value2 bigint, value3 bigint, value4 bigint, value5 bigint) ENGINE=innodb;

CREATE TABLE `columnstore_test` (id int COMMENT 'compression=2', value1 bigint COMMENT 'compression=2', value2 bigint COMMENT 'compression=2', value3 bigint COMMENT 'compression=2', value4 bigint COMMENT 'compression=2',value5 bigint COMMENT 'compression=2') ENGINE=columnstore;

Insert 1 million rows into the table, with all five value columns set to 0:

INSERT INTO innodb_test
SELECT CONCAT(a1.id,a2.id,a3.id,a4.id,a5.id,a6.id),
0,0,0,0,0
from 
  (select 0 as id union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) a1, 
  (select 0 as id union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) a2,
  (select 0 as id union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) a3,
  (select 0 as id union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) a4,
  (select 0 as id union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) a5,
  (select 0 as id union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) a6;

INSERT INTO columnstore_test SELECT * FROM innodb_test;

The ColumnStore table is larger than the InnoDB table:

call columnstore_info.table_usage(NULL, 'columnstore_test');
+--------------+------------------+-----------------+-----------------+-------------+
| TABLE_SCHEMA | TABLE_NAME       | DATA_DISK_USAGE | DICT_DISK_USAGE | TOTAL_USAGE |
+--------------+------------------+-----------------+-----------------+-------------+
| size_comp    | columnstore_test | 352.05 MB       | 0 Bytes         | 0 Bytes     |
+--------------+------------------+-----------------+-----------------+-------------+

SELECT table_name, (data_length + index_length) / (1024 * 1024) "Size in MB"  FROM information_schema.tables WHERE table_schema = schema() AND table_name = 'innodb_test';
+-------------+------------+
| table_name  | Size in MB |
+-------------+------------+
| innodb_test | 71.6094    |
+-------------+------------+

Moreover, if I create the table without compression, the size is essentially the same:

CREATE TABLE `columnstore_no_compression` (id int COMMENT 'compression=0', value1 bigint COMMENT 'compression=0', value2 bigint COMMENT 'compression=0', value3 bigint COMMENT 'compression=0', value4 bigint COMMENT 'compression=0',value5 bigint COMMENT 'compression=0') ENGINE=columnstore;

INSERT INTO columnstore_no_compression SELECT * FROM innodb_test;

call columnstore_info.table_usage(NULL, 'columnstore_no_compression');
+--------------+----------------------------+-----------------+-----------------+-------------+
| TABLE_SCHEMA | TABLE_NAME                 | DATA_DISK_USAGE | DICT_DISK_USAGE | TOTAL_USAGE |
+--------------+----------------------------+-----------------+-----------------+-------------+
| size_comp    | columnstore_no_compression | 352.00 MB       | 0 Bytes         | 0 Bytes     |
+--------------+----------------------------+-----------------+-----------------+-------------+

I am using version mariadb-columnstore-1.1.2-1.

My .ini file:

[client]
port = 3306
socket          = /usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock

[mysqld]
loose-server_audit_syslog_info = columnstore-1
port = 3306
socket          = /usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock
datadir         = /ssd/mariadb/db
skip-external-locking
key_buffer_size = 512M
max_allowed_packet = 1M
table_cache = 512
sort_buffer_size = 4M
read_buffer_size = 4M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 0
thread_stack = 512K
lower_case_table_names=1
group_concat_max_len=512
sql_mode="ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
infinidb_compression_type=2
infinidb_stringtable_threshold=20
infinidb_local_query=0
infinidb_diskjoin_smallsidelimit=0
infinidb_diskjoin_largesidelimit=0
infinidb_diskjoin_bucketsize=100
infinidb_um_mem_limit=0
infinidb_use_import_for_batchinsert=1
infinidb_import_for_batchinsert_delimiter=7
basedir                         = /usr/local/mariadb/columnstore/mysql/
character-sets-dir              = /usr/local/mariadb/columnstore/mysql/share/charsets/
lc-messages-dir                 = /usr/local/mariadb/columnstore/mysql/share/
plugin_dir                      = /usr/local/mariadb/columnstore/mysql/lib/plugin
binlog_format=ROW
server-id = 1
log-bin=/usr/local/mariadb/columnstore/mysql/db/mysql-bin
relay-log=/usr/local/mariadb/columnstore/mysql/db/relay-bin
relay-log-index = /usr/local/mariadb/columnstore/mysql/db/relay-bin.index
relay-log-info-file = /usr/local/mariadb/columnstore/mysql/db/relay-bin.info
tmpdir          = /ssd/tmp/

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[isamchk]
key_buffer_size = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer_size = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout

Is this the expected behavior, or am I doing something wrong?

Best answer

I am the lead software engineer for MariaDB ColumnStore.

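ColumnStore is optimized for large data sets and preallocates disk space for columns. The advantage is that fragmentation is less likely on spinning disks. The disadvantage is that on a small data set like yours it allocates a lot of space that goes unused.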

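It initially preallocates 256 KB for the first extent of a column, then expands that extent to hold 2^23 rows (just over 8 million). So for each of your BIGINT columns it preallocates 64 MB, and for your INT column 32 MB. The slight difference between the compressed and uncompressed tables comes from the header blocks on the compressed files. We have some information_schema tables that can show you the actual usage (to within 8 KB):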

https://mariadb.com/kb/en/library/columnstore-information-schema-tables/
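
As a rough check of the figures in this answer, the preallocation can be worked out by hand. The sketch below is only illustrative: it assumes that the 1 million rows fit in a single extent per column, that each extent is expanded to the full 2^23 rows, and that the "MB" reported by table_usage means 2^20 bytes. The names in the code are made up for this example.

```python
# Back-of-the-envelope check of the preallocation figures quoted in the answer.
# Assumptions (not confirmed by the source): one fully expanded extent per
# column, and "MB" meaning 2**20 bytes.
ROWS_PER_EXTENT = 2 ** 23   # rows covered by one expanded extent (8,388,608)
MIB = 1024 * 1024

# Column widths in bytes for the test table: one INT and five BIGINTs.
column_widths = {
    "id": 4,
    "value1": 8,
    "value2": 8,
    "value3": 8,
    "value4": 8,
    "value5": 8,
}

def preallocated_mib(width_bytes: int) -> float:
    """Size in MiB of one expanded extent for a column of the given width."""
    return ROWS_PER_EXTENT * width_bytes / MIB

per_column = {name: preallocated_mib(w) for name, w in column_widths.items()}
total_mib = sum(per_column.values())

for name, size in per_column.items():
    print(f"{name}: {size:.0f} MiB")
print(f"total: {total_mib:.2f} MiB")   # prints "total: 352.00 MiB"
```

Under those assumptions the total is 32 + 5 × 64 = 352 MB, which matches the 352.00 MB that table_usage reports for the uncompressed table in the question.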

So unless you plan to use larger data sets (at least in the range of several GB), you will unfortunately see high disk usage for very little data.

Regarding "mariadb - Table size: MariaDB ColumnStore vs. InnoDB", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48895502/