
sql - Limiting the return of non-unique values

Reposted. Author: 行者123. Updated: 2023-11-30 21:24:27

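I have two tables: posts and replies. Think of posts as blog entries and replies as the comments.

I want to display X posts, and then the latest three comments for each post.

My replies have a foreign key "post_id" which matches the "id" of each post.

I am trying to create a main page that has something along the lines of

Post - Reply - Reply - Reply

Post - Reply

and so on and so forth. I can accomplish this by using a for loop in my template and discarding the replies I don't need, but I hate grabbing data from the database that I won't use. Any ideas?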

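Best answer

This is actually a pretty interesting question.

Ha ha, disregard that, I suck.

On edit: this answer works, but on MySQL it becomes tediously slow when the number of parent rows is as few as 100. However, see below for a performance fix.

Obviously, you can run this query once per post: select * from comments where post_id = $id order by id desc limit 3. That creates a lot of overhead, because you end up doing one database query per post: the dreaded N+1 queries.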
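For reference, the per-post approach can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module rather than MySQL. The posts and replies tables and the post_id foreign key follow the question; the body column, the sample rows, and the function name latest_replies_n_plus_1 are assumptions made up for the demo.

```python
import sqlite3

def latest_replies_n_plus_1(conn, per_post=3):
    """Return {post_id: [reply bodies, newest first]} using one query per post."""
    result = {}
    post_ids = [row[0] for row in conn.execute("SELECT id FROM posts ORDER BY id")]
    for post_id in post_ids:  # one extra round trip per post: the N in N+1
        rows = conn.execute(
            "SELECT body FROM replies WHERE post_id = ? ORDER BY id DESC LIMIT ?",
            (post_id, per_post),
        ).fetchall()
        result[post_id] = [r[0] for r in rows]
    return result

# Hypothetical sample data: post 1 has four replies, post 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE replies (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
    INSERT INTO posts VALUES (1, 'first'), (2, 'second');
    INSERT INTO replies VALUES
        (1, 1, 'r1'), (2, 1, 'r2'), (3, 2, 'r3'), (4, 1, 'r4'), (5, 1, 'r5');
""")
latest = latest_replies_n_plus_1(conn)
```

The result is correct, but the number of statements grows with the number of posts, which is the cost the rest of the answer tries to avoid.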

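If you want to get all posts at once (or some subset with a where), the following will, surprisingly, work. It assumes that comments have a monotonically increasing id (since a datetime is not guaranteed to be unique), but it allows comment ids to be interleaved among posts.

Since an auto_increment id column is monotonically increasing, if comment has an id, you're all set.

First, create this view. In the view, I call post parent and comment child: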

create view parent_top_3_children as
select a.*,
(select max(id) from child where parent_id = a.id) as maxid,
(select max(id) from child where id < maxid
and parent_id = a.id) as maxidm1,
(select max(id) from child where id < maxidm1
and parent_id = a.id) as maxidm2
from parent a;

maxidm1 is just "max id minus 1" and maxidm2 is "max id minus 2": that is, the second and third greatest child ids within a particular parent id.
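The "greatest id, then the greatest id below that" idea can be checked by hand against the child rows listed in the sample data later in this answer. The sketch below uses Python's sqlite3 module as a stand-in for MySQL and loads only the child rows visible in that listing (ids 18 to 25); the helper name top_three_ids is hypothetical, and it runs the three lookups as separate statements instead of as one view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO child (id, parent_id) VALUES (?, ?)",
    [(18, 3), (19, 2), (20, 2), (21, 3), (22, 2), (23, 2), (24, 3), (25, 2)],
)

def top_three_ids(conn, parent_id):
    """Return (maxid, maxidm1, maxidm2) for one parent; missing values are None."""
    maxid = conn.execute(
        "SELECT max(id) FROM child WHERE parent_id = ?", (parent_id,)
    ).fetchone()[0]
    # Each later step asks for the greatest id strictly below the previous one.
    # If the previous value is NULL, "id < NULL" matches no row and max() is NULL.
    step = "SELECT max(id) FROM child WHERE parent_id = ? AND id < ?"
    maxidm1 = conn.execute(step, (parent_id, maxid)).fetchone()[0]
    maxidm2 = conn.execute(step, (parent_id, maxidm1)).fetchone()[0]
    return maxid, maxidm1, maxidm2
```

For parent 2 this yields 25, 23 and 22, and for parent 3 it yields 24, 21 and 18, which agrees with the view output shown later. A parent without children gets three NULLs.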

Then join the view to whatever you need from the comment (I'll call that text):

select a.*, 
b.text as latest_comment,
c.text as second_latest_comment,
d.text as third_latest_comment
from parent_top_3_children a
left outer join child b on (b.id = a.maxid)
left outer join child c on (c.id = a.maxidm1)
left outer join child d on (d.id = a.maxidm2);

Naturally, you can add whatever where clause you want to that, to limit the posts: where a.category = 'foo' or whatever.


Here's what my tables look like:

mysql> select * from parent;
+----+------+------+------+
| id | a | b | c |
+----+------+------+------+
| 1 | 1 | 1 | NULL |
| 2 | 2 | 2 | NULL |
| 3 | 3 | 3 | NULL |
+----+------+------+------+
3 rows in set (0.00 sec)

And a portion of the children. Parent 1 has no children:

mysql> select * from child;
+----+-----------+------+------+------+------+
| id | parent_id | a | b | c | d |
+----+-----------+------+------+------+------+

. . . .
| 18 | 3 | NULL | NULL | NULL | NULL |
| 19 | 2 | NULL | NULL | NULL | NULL |
| 20 | 2 | NULL | NULL | NULL | NULL |
| 21 | 3 | NULL | NULL | NULL | NULL |
| 22 | 2 | NULL | NULL | NULL | NULL |
| 23 | 2 | NULL | NULL | NULL | NULL |
| 24 | 3 | NULL | NULL | NULL | NULL |
| 25 | 2 | NULL | NULL | NULL | NULL |
+----+-----------+------+------+------+------+
24 rows in set (0.00 sec)

The view (queried here as parent_top_3) gives us this:

mysql> select * from parent_top_3;
+----+------+------+------+-------+---------+---------+
| id | a | b | c | maxid | maxidm1 | maxidm2 |
+----+------+------+------+-------+---------+---------+
| 1 | 1 | 1 | NULL | NULL | NULL | NULL |
| 2 | 2 | 2 | NULL | 25 | 23 | 22 |
| 3 | 3 | 3 | NULL | 24 | 21 | 18 |
+----+------+------+------+-------+---------+---------+
3 rows in set (0.21 sec)

The explain plan for the view is only slightly hairy:

mysql> explain select * from parent_top_3;
+----+--------------------+------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+------+---------------+------+---------+------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | |
| 2 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 3 | |
| 5 | DEPENDENT SUBQUERY | child | ALL | PRIMARY | NULL | NULL | NULL | 24 | Using where |
| 4 | DEPENDENT SUBQUERY | child | ALL | PRIMARY | NULL | NULL | NULL | 24 | Using where |
| 3 | DEPENDENT SUBQUERY | child | ALL | NULL | NULL | NULL | NULL | 24 | Using where |
+----+--------------------+------------+------+---------------+------+---------+------+------+-------------+

However, if we add an index on the parent foreign key (parent_id), it gets better:

mysql> create index pid on child(parent_id);

mysql> explain select * from parent_top_3;
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | |
| 2 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 3 | |
| 5 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 2 | Using where |
| 4 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 2 | Using where |
| 3 | DEPENDENT SUBQUERY | child | ref | pid | pid | 5 | util.a.id | 2 | Using where |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
5 rows in set (0.04 sec)
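The effect of such an index can also be observed outside MySQL. The following sketch uses SQLite through Python's sqlite3 module and its EXPLAIN QUERY PLAN statement, whose output format differs from MySQL's EXPLAIN. The table is a cut-down, hypothetical version of child, and plan_details is a helper name invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER)")

def plan_details(conn):
    """Return the description strings of the plan for the per-parent lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT max(id) FROM child WHERE parent_id = ?", (2,)
    ).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

before = plan_details(conn)   # no index on parent_id yet: a full table scan
conn.execute("CREATE INDEX pid ON child(parent_id)")
after = plan_details(conn)    # the plan should now mention the pid index

print(before)
print(after)
```

The exact wording of the plan text depends on the SQLite version, so it is safer to look for the index name than to compare whole strings.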

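As noted above, this begins to fall apart when the number of parent rows is as few as 100, even if we select from the parent using its primary key: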

mysql> select * from parent_top_3 where  id < 10;
+----+------+------+------+-------+---------+---------+
| id | a | b | c | maxid | maxidm1 | maxidm2 |
+----+------+------+------+-------+---------+---------+
| 1 | 1 | 1 | NULL | NULL | NULL | NULL |
| 2 | 2 | 2 | NULL | 25 | 23 | 22 |
| 3 | 3 | 3 | NULL | 24 | 21 | 18 |
| 4 | NULL | 1 | NULL | 65 | 64 | 63 |
| 5 | NULL | 2 | NULL | 73 | 72 | 71 |
| 6 | NULL | 3 | NULL | 113 | 112 | 111 |
| 7 | NULL | 1 | NULL | 209 | 208 | 207 |
| 8 | NULL | 2 | NULL | 401 | 400 | 399 |
| 9 | NULL | 3 | NULL | 785 | 784 | 783 |
+----+------+------+------+-------+---------+---------+
9 rows in set (1 min 3.11 sec)

(Note that I'm intentionally testing on a slow machine, with the data stored on a slow flash disk.)

And here's the explain, looking for exactly one id (and the first one, at that):

mysql> explain select * from parent_top_3 where id = 1;
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 1000 | Using where |
| 2 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 1000 | |
| 5 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 179 | Using where |
| 4 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 179 | Using where |
| 3 | DEPENDENT SUBQUERY | child | ref | pid | pid | 5 | util.a.id | 179 | Using where |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
5 rows in set (56.01 sec)

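Over 56 seconds for one row, even on my slow machine, is unacceptable by two orders of magnitude.

So can we salvage this query? It works; it's just too slow.

Here's the explain plan for the modified query. It looks just as bad, or worse: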

mysql> explain select * from parent_top_3a where id = 1;
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 100 | Using where |
| 2 | DERIVED | <derived4> | ALL | NULL | NULL | NULL | NULL | 100 | |
| 4 | DERIVED | <derived6> | ALL | NULL | NULL | NULL | NULL | 100 | |
| 6 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 100 | |
| 7 | DEPENDENT SUBQUERY | child | ref | pid | pid | 5 | util.a.id | 179 | Using where |
| 5 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | a.id | 179 | Using where |
| 3 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | a.id | 179 | Using where |
+----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+
7 rows in set (0.05 sec)

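But it completes three orders of magnitude faster, in 1/20th of a second!

How do we get to the much faster parent_top_3a? We create three views, each one dependent on the previous one: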

create view parent_top_1 as  
select a.*,
(select max(id) from child where parent_id = a.id)
as maxid
from parent a;

create view parent_top_2 as
select a.*,
(select max(id) from child where parent_id = a.id and id < a.maxid)
as maxidm1
from parent_top_1 a;

create view parent_top_3a as
select a.*,
(select max(id) from child where parent_id = a.id and id < a.maxidm1)
as maxidm2
from parent_top_2 a;

Not only does this work much more quickly, it's also legal on RDBMSes other than MySQL.
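As a small portability check, the three chained views can be run on another engine. Below is a minimal sketch using SQLite via Python's sqlite3 module, followed by the join that fetches the comment text. The cut-down parent and child schemas, the title column, and the sample text values are assumptions for illustration; the view definitions and the child ids follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, text TEXT);
    CREATE INDEX pid ON child(parent_id);

    INSERT INTO parent VALUES (1, 'no comments'), (2, 'busy'), (3, 'quiet');
    INSERT INTO child VALUES
        (18, 3, 'c18'), (19, 2, 'c19'), (20, 2, 'c20'), (21, 3, 'c21'),
        (22, 2, 'c22'), (23, 2, 'c23'), (24, 3, 'c24'), (25, 2, 'c25');

    -- Each view adds one column and reads the previous view's column.
    CREATE VIEW parent_top_1 AS
        SELECT a.*, (SELECT max(id) FROM child WHERE parent_id = a.id) AS maxid
        FROM parent a;
    CREATE VIEW parent_top_2 AS
        SELECT a.*, (SELECT max(id) FROM child
                     WHERE parent_id = a.id AND id < a.maxid) AS maxidm1
        FROM parent_top_1 a;
    CREATE VIEW parent_top_3a AS
        SELECT a.*, (SELECT max(id) FROM child
                     WHERE parent_id = a.id AND id < a.maxidm1) AS maxidm2
        FROM parent_top_2 a;
""")

# One statement returns every post with the text of its three newest comments.
rows = conn.execute("""
    SELECT a.id, b.text, c.text, d.text
    FROM parent_top_3a a
    LEFT OUTER JOIN child b ON b.id = a.maxid
    LEFT OUTER JOIN child c ON c.id = a.maxidm1
    LEFT OUTER JOIN child d ON d.id = a.maxidm2
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Because every join is a left outer join, a post without comments still appears, with NULL in all three comment columns. This sketch says nothing about timing; it only shows that the chained definitions are accepted and give the expected rows.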

Let's bump the number of parent rows up to 12800 and the number of child rows to 1536 (most blog posts don't get comments, right? ;))

mysql> select * from parent_top_3a where id >= 20 and id < 40;
+----+------+------+------+-------+---------+---------+
| id | a | b | c | maxid | maxidm1 | maxidm2 |
+----+------+------+------+-------+---------+---------+
| 39 | NULL | 2 | NULL | NULL | NULL | NULL |
| 38 | NULL | 1 | NULL | NULL | NULL | NULL |
| 37 | NULL | 3 | NULL | NULL | NULL | NULL |
| 36 | NULL | 2 | NULL | NULL | NULL | NULL |
| 35 | NULL | 1 | NULL | NULL | NULL | NULL |
| 34 | NULL | 3 | NULL | NULL | NULL | NULL |
| 33 | NULL | 2 | NULL | NULL | NULL | NULL |
| 32 | NULL | 1 | NULL | NULL | NULL | NULL |
| 31 | NULL | 3 | NULL | NULL | NULL | NULL |
| 30 | NULL | 2 | NULL | 1537 | 1536 | 1535 |
| 29 | NULL | 1 | NULL | 1529 | 1528 | 1527 |
| 28 | NULL | 3 | NULL | 1513 | 1512 | 1511 |
| 27 | NULL | 2 | NULL | 1505 | 1504 | 1503 |
| 26 | NULL | 1 | NULL | 1481 | 1480 | 1479 |
| 25 | NULL | 3 | NULL | 1457 | 1456 | 1455 |
| 24 | NULL | 2 | NULL | 1425 | 1424 | 1423 |
| 23 | NULL | 1 | NULL | 1377 | 1376 | 1375 |
| 22 | NULL | 3 | NULL | 1329 | 1328 | 1327 |
| 21 | NULL | 2 | NULL | 1281 | 1280 | 1279 |
| 20 | NULL | 1 | NULL | 1225 | 1224 | 1223 |
+----+------+------+------+-------+---------+---------+
20 rows in set (1.01 sec)

Note that these timings are for MyISAM tables; I'll leave it to someone else to do the timings on InnoDB.


But using PostgreSQL, on a similar but not identical data set, we get similar timings for where predicates involving the parent's columns:

postgres=# select (select count(*) from parent) as parent_count, (select count(*) from child) as child_count;
parent_count | child_count
--------------+-------------
12289 | 1536

postgres=# select * from parent_top_3a where id >= 20 and id < 40;
id | a | b | c | maxid | maxidm1 | maxidm2
----+---+----+---+-------+---------+---------
20 | | 18 | | 1464 | 1462 | 1461
21 | | 88 | | 1463 | 1460 | 1457
22 | | 72 | | 1488 | 1486 | 1485
23 | | 13 | | 1512 | 1510 | 1509
24 | | 49 | | 1560 | 1558 | 1557
25 | | 92 | | 1559 | 1556 | 1553
26 | | 45 | | 1584 | 1582 | 1581
27 | | 37 | | 1608 | 1606 | 1605
28 | | 96 | | 1607 | 1604 | 1601
29 | | 90 | | 1632 | 1630 | 1629
30 | | 53 | | 1631 | 1628 | 1625
31 | | 57 | | | |
32 | | 64 | | | |
33 | | 79 | | | |
34 | | 37 | | | |
35 | | 60 | | | |
36 | | 75 | | | |
37 | | 34 | | | |
38 | | 87 | | | |
39 | | 43 | | | |
(20 rows)

Time: 91.139 ms

Regarding sql - Limiting the return of non-unique values, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/780236/
