
MySQL query is very slow on large data

Reposted. Author: 行者123. Updated: 2023-11-29 01:43:43

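I'm no MySQL whiz, but I get by. I've just inherited a very large table (600,000 rows and around 90 columns (please kill me...)) and I have created a smaller table to link it with a categories table.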

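I'm trying to query said table with a left join so that I have both sets of data in one object, but it runs terribly slowly and I'm not good enough to sort it out myself. I'd really appreciate a little guidance and an explanation of why it's so slow.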

SELECT 
`products`.`Product_number`,
`products`.`Price`,
`products`.`Previous_Price_1`,
`products`.`Previous_Price_2`,
`products`.`Product_number`,
`products`.`AverageOverallRating`,
`products`.`Name`,
`products`.`Brand_description`
FROM `product_categories`
LEFT OUTER JOIN `products`
ON `products`.`product_id`= `product_categories`.`product_id`
WHERE COALESCE(product_categories.cat4, product_categories.cat3,
product_categories.cat2, product_categories.cat1) = '123456'
AND `product_categories`.`product_id` != 0

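Both tables are MyISAM. The products table has indexes on Product_number and Brand_Description, and the product_categories table has a unique index on the combination of all its columns, if this information is of any help.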

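Having inherited this system, I need to get it working ASAP before I nuke it and redo it properly, so any help right now will earn you my utmost respect!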

[Edit] Here is the output of EXPLAIN EXTENDED:

+----+-------------+--------------------+-------+---------------+------+---------+------+---------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------------+-------+---------------+------+---------+------+---------+----------+--------------------------+
| 1 | SIMPLE | product_categories | index | NULL | cat1 | 23 | NULL | 1224419 | 100.00 | Using where; Using index |
| 1 | SIMPLE | products | ALL | Product_id | NULL | NULL | NULL | 512376 | 100.00 | |
+----+-------------+--------------------+-------+---------------+------+---------+------+---------+----------+--------------------------+

Best answer

Optimize the tables

To establish a baseline, I would first suggest running an OPTIMIZE TABLE command on both tables. Note that this might take some time. From the docs:

OPTIMIZE TABLE should be used if you have deleted a large part of a table or if you have made many changes to a table with variable-length rows (tables that have VARCHAR, VARBINARY, BLOB, or TEXT columns). Deleted rows are maintained in a linked list and subsequent INSERT operations reuse old row positions. You can use OPTIMIZE TABLE to reclaim the unused space and to defragment the data file. After extensive changes to a table, this statement may also improve performance of statements that use the table, sometimes significantly.

[...]

For MyISAM tables, OPTIMIZE TABLE works as follows:

  1. If the table has deleted or split rows, repair the table.

  2. If the index pages are not sorted, sort them.

  3. If the table's statistics are not up to date (and the repair could not be accomplished by sorting the index), update them.
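As a minimal sketch of the step above, assuming the table names are exactly those used in the question's query, the command could look like the following. OPTIMIZE TABLE accepts a comma-separated list of tables; on MyISAM, expect the tables to be unavailable for writes while it runs, so it is probably best scheduled for a quiet period.

```sql
-- Defragment the data files and sort/refresh the indexes of both MyISAM tables.
-- Table names are taken from the query in the question.
OPTIMIZE TABLE `products`, `product_categories`;
```

The statement returns one result row per table with a status message, which can be used to confirm that the operation completed.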

Indexing

If space and index management are not a concern, you can try adding a composite index on

product_categories.cat4, product_categories.cat3, product_categories.cat2, product_categories.cat1

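This is advisable if you often use a leftmost subset of these columns in your queries. The query plan indicates that it can use the cat1 index of product_categories, which most likely only includes the cat1 column. By adding all four category columns to an index, it can seek to the desired rows more efficiently. From the docs: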

MySQL can use multiple-column indexes for queries that test all the columns in the index, or queries that test just the first column, the first two columns, the first three columns, and so on. If you specify the columns in the right order in the index definition, a single composite index can speed up several kinds of queries on the same table.
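A sketch of how such an index might be created, assuming the column names from the question. The index name `idx_cat4_cat3_cat2_cat1` is hypothetical. Because the question's WHERE clause wraps the columns in COALESCE(...), it is worth checking with EXPLAIN whether the optimizer can actually seek with the new index or still scans it in full.

```sql
-- Composite index over the four category columns, in the order suggested above.
-- The index name is an illustrative choice, not something from the original schema.
ALTER TABLE `product_categories`
  ADD INDEX `idx_cat4_cat3_cat2_cat1` (`cat4`, `cat3`, `cat2`, `cat1`);

-- Re-check the plan for the filtering part of the original query.
EXPLAIN
SELECT `product_id`
FROM `product_categories`
WHERE COALESCE(`cat4`, `cat3`, `cat2`, `cat1`) = '123456'
  AND `product_id` != 0;
```

If the `rows` estimate for product_categories stays close to the full table size after this change, the index is being scanned rather than used for a lookup.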

Structure

Additionally, given that your table has 90 columns, you should be aware that a wider table can lead to slower query performance. You may want to consider vertically partitioning your table into multiple tables:

Having too many columns can bloat your record size, which in turn results in more memory blocks being read in and out of memory causing higher I/O. This can hurt performance. One way to combat this is to split your tables into smaller more independent tables with smaller cardinalities than the original. This should now allow for a better Blocking Factor (as defined above) which means less I/O and faster performance. This process of breaking apart the table like this is a called a Vertical Partition.
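One possible way to sketch a vertical partition for this case, assuming the columns selected in the question are the frequently read ones. The table name `products_core` and the index name `idx_product_id` are hypothetical, and the remaining columns would stay in the original wide table, joined on `product_id` when needed.

```sql
-- Narrow copy of the wide table holding only the columns the query reads.
-- `products_core` is an illustrative name, not part of the original schema.
CREATE TABLE `products_core` ENGINE=MyISAM AS
SELECT `product_id`,
       `Product_number`,
       `Price`,
       `Previous_Price_1`,
       `Previous_Price_2`,
       `AverageOverallRating`,
       `Name`,
       `Brand_description`
FROM `products`;

-- Index the join column so the join from product_categories can look rows up.
-- A plain (non-unique) index is used because uniqueness of product_id is not stated.
ALTER TABLE `products_core`
  ADD INDEX `idx_product_id` (`product_id`);
```

The original query could then join `product_categories` to `products_core` instead of `products`. Keeping the two tables in sync is an extra maintenance cost that this sketch does not address.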

Regarding "MySQL query is very slow on large data", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/12884263/
