
mysql - Understanding MySQL query performance using the Sean Lahman MLB database

Reposted. Author: 行者123. Updated: 2023-11-30 00:38:45

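I recently downloaded the Sean Lahman SQL dump, imported the data into a MySQL database, and started running some queries. My SQL knowledge is pretty limited: basic inner joins and simple subqueries, never really anything beyond that. But it's a very cool data set, and I immediately started running into performance problems that I don't quite understand.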

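The following query joins the Batting and Managers tables and returns the playerID, some offensive stats, and the managerID for the top 5 HR seasons by Seattle Mariners hitters: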

select 
b.playerID, b.yearID, b.H, b.HR, b.RBI, (b.H / b.AB) b_avg, mgr.managerID
from
Batting b
inner join
Managers mgr on b.yearID = mgr.yearID and b.teamID = mgr.teamID
where b.teamID = 'SEA'
order by b.HR desc
limit 5;
+-----------+--------+------+------+------+--------+------------+
| playerID  | yearID | H    | HR   | RBI  | b_avg  | managerID  |
+-----------+--------+------+------+------+--------+------------+
| griffke02 |   1997 |  185 |   56 |  147 | 0.3043 | pinielo01m |
| griffke02 |   1998 |  180 |   56 |  146 | 0.2844 | pinielo01m |
| griffke02 |   1996 |  165 |   49 |  140 | 0.3028 | pinielo01m |
| griffke02 |   1999 |  173 |   48 |  134 | 0.2855 | pinielo01m |
| griffke02 |   1993 |  180 |   45 |  109 | 0.3093 | pinielo01m |
+-----------+--------+------+------+------+--------+------------+
5 rows in set (0.11 sec)

That came back quickly (0.11 sec). But when I tried to get the full names of the players and managers, the query slowed down dramatically:

select 
mp.nameLast plyr_first, mp.nameFirst plyr_last, b.yearID, b.H, b.HR, b.RBI, (b.H / b.AB) b_avg, mm.nameLast mgr_last, mm.nameFirst mgr_lfirst
from
Batting b
inner join
Managers mgr
on b.yearID = mgr.yearID and b.teamID = mgr.teamID
inner join
Master mp
on b.playerID = mp.playerID
inner join
Master mm on mgr.managerID = mm.managerID
where
b.teamID = 'SEA'
order by
b.HR desc limit 5;
+------------+-----------+--------+------+------+------+--------+----------+------------+
| plyr_first | plyr_last | yearID | H    | HR   | RBI  | b_avg  | mgr_last | mgr_lfirst |
+------------+-----------+--------+------+------+------+--------+----------+------------+
| Griffey    | Ken       |   1997 |  185 |   56 |  147 | 0.3043 | Piniella | Lou        |
| Griffey    | Ken       |   1998 |  180 |   56 |  146 | 0.2844 | Piniella | Lou        |
| Griffey    | Ken       |   1996 |  165 |   49 |  140 | 0.3028 | Piniella | Lou        |
| Griffey    | Ken       |   1999 |  173 |   48 |  134 | 0.2855 | Piniella | Lou        |
| Griffey    | Ken       |   1993 |  180 |   45 |  109 | 0.3093 | Piniella | Lou        |
+------------+-----------+--------+------+------+------+--------+----------+------------+
5 rows in set (11.43 sec)

Here are the relevant rows of the Master table description (I left out a lot of rows, but these are the main ones):

+--------------+--------------+------+-----+---------+-------+
| Field        | Type         | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| lahmanID     | int(11)      | NO   | PRI | NULL    |       |
| playerID     | varchar(10)  | YES  |     | NULL    |       |
| managerID    | varchar(10)  | YES  |     | NULL    |       |
| nameFirst    | varchar(50)  | YES  |     | NULL    |       |
| nameLast     | varchar(50)  | YES  |     | NULL    |       |
+--------------+--------------+------+-----+---------+-------+

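I basically started with the Batting table, since that's where the stats are. Then I joined in the Managers table and still got good results. Then I joined the Master table to get the players' first and last names, which wasn't bad either, but the second join to the Master table is what causes the problem.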

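When I modify the query to return only the managerID, and not the manager's first and last name, it is much faster, about a quarter of a second. Any ideas on how to get the first/last names of both players and managers with good performance? Could you point me in the right direction as to what is slowing the query down?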

Thanks, bp

Best answer

This may not be right, but you could try changing the join for the manager's name to:

inner join Master mm
on mgr.managerID = mm.playerID

So you would run:

select mp.nameLast plyr_first,
mp.nameFirst plyr_last,
b.yearID,
b.H,
b.HR,
b.RBI,
(b.H / b.AB) b_avg,
mm.nameLast mgr_last,
mm.nameFirst mgr_lfirst
from Batting b
inner join Managers mgr
on b.yearID = mgr.yearID
and b.teamID = mgr.teamID
inner join Master mp
on b.playerID = mp.playerID
inner join Master mm
on mgr.managerID = mm.playerID
where b.teamID = 'SEA'
order by b.HR desc limit 5;
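
One way to compare the two versions is to prefix each with EXPLAIN and look at how MySQL reads the two Master aliases (mp and mm). The sketch below is not from the original post. It assumes, based on the empty Key column in the table description in the question, that Master has no index on playerID or managerID; the index names are hypothetical, and whether the indexes actually help depends on the data.

```sql
-- Show the join plan instead of running the query.
-- A row with type = ALL for mp or mm means MySQL scans the whole
-- Master table for that join.
explain
select mp.nameLast  plyr_last,
       mp.nameFirst plyr_first,
       b.yearID,
       b.HR,
       mm.nameLast  mgr_last,
       mm.nameFirst mgr_first
from Batting b
inner join Managers mgr
   on b.yearID = mgr.yearID
  and b.teamID = mgr.teamID
inner join Master mp
   on b.playerID = mp.playerID
inner join Master mm
   on mgr.managerID = mm.managerID
where b.teamID = 'SEA'
order by b.HR desc limit 5;

-- Assumption: the join columns on Master are not indexed.
-- The index names below are made up for illustration.
create index idx_master_playerID  on Master (playerID);
create index idx_master_managerID on Master (managerID);
```

Running the same EXPLAIN again after creating the indexes shows whether the plan for mp and mm changed.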

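I just want to rule out an incorrect join as the cause. If that doesn't work, could you take the "limit 5" out of the query and check whether any rows are duplicated and/or otherwise "wrong"?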
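
A rough way to do that check without reading every row by eye is to group the unlimited result and count rows per player-season. This is only a sketch using the table and column names from the question; the join on mm.playerID is the variant suggested above, and the second statement just reports how many Master rows have a managerID at all.

```sql
-- Player-seasons that produce more than one result row.
-- Any rows returned here mean the joins are multiplying rows,
-- for example when more than one Managers row matches the
-- same yearID and teamID.
select b.playerID, b.yearID, count(*) as row_count
from Batting b
inner join Managers mgr
   on b.yearID = mgr.yearID
  and b.teamID = mgr.teamID
inner join Master mp
   on b.playerID = mp.playerID
inner join Master mm
   on mgr.managerID = mm.playerID
where b.teamID = 'SEA'
group by b.playerID, b.yearID
having count(*) > 1
order by row_count desc;

-- managerID is nullable in Master, so see how many rows carry one.
-- count(managerID) ignores NULLs; count(*) does not.
select count(*) as total_rows,
       count(managerID) as rows_with_managerID
from Master;
```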

Regarding mysql - Understanding MySQL query performance using the Sean Lahman MLB database, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/21959663/
