gpt4 book ai didi

mysql - 从每组中选择 N 个值

转载 作者:行者123 更新时间:2023-11-29 16:01:46 24 4
gpt4 key购买 nike

我有 2 个表,必须连接它们才能获取所需的数据。

Table1是文章,Table2用于 channel /类别。

为了简单起见,两个表都只有 2 个字段。

基本上,我需要选择N个唯一的ID每个组的 s ( term_id )。值不应在组之间重复。

我可以通过RANK达到预期的结果和PARTITION但它只适用于 MySQL 8.x,而我需要它在 5.7 中工作。

摆弄表结构,一些示例数据和当前查询是 here

这种方法如何适应MySQL 5.7?

SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));
CREATE TABLE `Table1` (
`ID` BIGINT(20) UNSIGNED NOT NULL,
`date` DATETIME NOT NULL);
CREATE TABLE `Table2` (
`object_id` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0',
`term_id` BIGINT(20) UNSIGNED NOT NULL DEFAULT '0'
);

INSERT INTO `Table1` (`ID`, `date`) VALUES
(195512, '2019-02-27 12:28:18'),
(195260, '2019-01-02 15:26:45'),
(195245, '2019-01-02 14:22:57'),
(192628, '2018-12-20 13:44:39'),
(192563, '2018-12-20 10:53:32'),
(191224, '2018-12-14 15:32:22'),
(189468, '2018-12-07 10:25:31'),
(188520, '2018-12-03 14:59:59'),
(187808, '2018-11-30 14:03:27'),
(185682, '2018-11-20 21:20:18'),
(195473, '2019-01-03 10:52:01'),
(192982, '2018-12-21 19:55:42'),
(192695, '2018-12-20 15:22:41'),
(192611, '2018-12-20 13:19:29'),
(192320, '2018-12-19 14:06:06'),
(192026, '2018-12-18 12:16:22'),
(191218, '2018-12-14 14:37:57'),
(190677, '2018-12-12 13:22:07'),
(190661, '2018-12-12 13:08:27'),
(190147, '2018-12-10 12:43:42'),
(195260, '2019-01-02 15:26:45'),
(192676, '2018-12-20 15:30:08'),
(192409, '2018-12-19 15:44:59'),
(192057, '2018-12-18 14:46:46'),
(191825, '2018-12-17 15:21:13'),
(190968, '2018-12-13 15:59:28'),
(190701, '2018-12-12 15:21:18'),
(190450, '2018-12-11 16:03:07'),
(190188, '2018-12-10 15:08:08'),
(189484, '2018-12-07 15:43:13'),
(195295, '2019-01-02 15:23:54'),
(192876, '2018-12-21 13:25:22'),
(192537, '2018-12-20 11:59:50'),
(192359, '2018-12-19 13:48:19'),
(192050, '2018-12-18 13:18:03'),
(192059, '2018-12-18 13:03:10'),
(192051, '2018-12-18 13:00:17'),
(191581, '2018-12-17 12:00:12'),
(191260, '2018-12-14 15:55:44'),
(190729, '2018-12-12 15:12:36');

INSERT INTO `Table2` (`object_id`, `term_id`) VALUES
(195295, 568),
(192876, 568),
(192537, 568),
(192359, 568),
(192050, 568),
(192059, 568),
(192051, 568),
(191581, 568),
(191260, 568),
(190729, 568),
(190387, 568),
(190349, 568),
(189122, 568),
(188751, 568),
(187831, 568),
(187273, 568),
(187020, 568),
(187025, 568),
(185320, 568),
(185283, 568),
(195512, 1) ,
(195260, 1) ,
(195245, 1) ,
(192628, 1) ,
(192563, 1) ,
(191224, 1) ,
(189468, 1) ,
(188520, 1) ,
(187808, 1) ,
(185682, 1) ,
(183886, 1) ,
(182668, 1) ,
(182566, 1) ,
(182194, 1) ,
(180177, 1) ,
(179738, 1) ,
(179181, 1) ,
(176889, 1) ,
(176862, 1) ,
(175258, 1) ,
(195473, 564),
(192982, 564),
(192695, 564),
(192611, 564),
(192320, 564),
(192026, 564),
(191218, 564),
(190677, 564),
(190661, 564),
(190147, 564),
(189468, 564),
(189190, 564),
(189159, 564),
(189062, 564),
(188732, 564),
(188688, 564),
(188666, 564),
(188609, 564),
(188611, 564),
(188613, 564),
(195260, 91170),
(192676, 91170),
(192409, 91170),
(192057, 91170),
(191825, 91170),
(190968, 91170),
(190701, 91170),
(190450, 91170),
(190188, 91170),
(189484, 91170),
(189224, 91170),
(189011, 91170),
(188716, 91170),
(188522, 91170),
(187585, 91170),
(187297, 91170),
(187094, 91170),
(186788, 91170),
(185769, 91170),
(185577, 91170);

SELECT `ID`, `term_id`, `date`, `rnk`
FROM
(
SELECT DISTINCT(`ID`), `term_id`, `date`, RANK() OVER (PARTITION BY `channels`.`term_id` ORDER BY FIELD(`channels`.`term_id`, 1, 564, 91170, 568 ), `main`.`date` DESC) AS `rnk`
FROM `Table1` AS `main`
INNER JOIN `Table2` AS `channels` ON(`channels`.`object_id` = `main`.`ID`) AND `channels`.`term_id` IN (1, 564, 91170, 568)
GROUP BY (`ID`)) AS x
WHERE `rnk` <= 3
| ID | term_id | date | rnk |
| ------ | ------- | ------------------- | --- |
| 195512 | 1 | 2019-02-27 12:28:18 | 1 |
| 195260 | 1 | 2019-01-02 15:26:45 | 2 |
| 195245 | 1 | 2019-01-02 14:22:57 | 3 |
| 195473 | 564 | 2019-01-03 10:52:01 | 1 |
| 192982 | 564 | 2018-12-21 19:55:42 | 2 |
| 192695 | 564 | 2018-12-20 15:22:41 | 3 |
| 195295 | 568 | 2019-01-02 15:23:54 | 1 |
| 192876 | 568 | 2018-12-21 13:25:22 | 2 |
| 192537 | 568 | 2018-12-20 11:59:50 | 3 |
| 192676 | 91170 | 2018-12-20 15:30:08 | 1 |
| 192409 | 91170 | 2018-12-19 15:44:59 | 2 |
| 192057 | 91170 | 2018-12-18 14:46:46 | 3 |

最佳答案

仅适用于 8.0 之前的版本...

SET @prev = null;
SET @i = 0;

SELECT id
, term_id
, date
, i
FROM
(SELECT x.id
, y.term_id
, x.date
, CASE WHEN @prev = term_id THEN @i:=@i+1 ELSE @i:=1 END i
, @prev:=term_id
FROM (SELECT DISTINCT id, date FROM table1) x
JOIN table2 y
ON y.object_id = x.ID
WHERE y.term_id IN (1, 564, 91170, 568)
ORDER
BY y.term_id
, x.date DESC
) n
WHERE i<=3
ORDER
BY term_id,i;

+--------+---------+---------------------+------+
| id | term_id | date | i |
+--------+---------+---------------------+------+
| 195512 | 1 | 2019-02-27 12:28:18 | 1 |
| 195260 | 1 | 2019-01-02 15:26:45 | 2 |
| 195245 | 1 | 2019-01-02 14:22:57 | 3 |
| 195473 | 564 | 2019-01-03 10:52:01 | 1 |
| 192982 | 564 | 2018-12-21 19:55:42 | 2 |
| 192695 | 564 | 2018-12-20 15:22:41 | 3 |
| 195295 | 568 | 2019-01-02 15:23:54 | 1 |
| 192876 | 568 | 2018-12-21 13:25:22 | 2 |
| 192537 | 568 | 2018-12-20 11:59:50 | 3 |
| 195260 | 91170 | 2019-01-02 15:26:45 | 1 |
| 192676 | 91170 | 2018-12-20 15:30:08 | 2 |
| 192409 | 91170 | 2018-12-19 15:44:59 | 3 |
+--------+---------+---------------------+------+

请注意,示例数据集包含重复数据。这在结构良好的 RDBMS 中是矛盾的。因此,您可能想要解决这个问题。

关于mysql - 从每组中选择 N 个值,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/56133477/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com