MySQL - Improving a complex query with custom calculated columns


Reposted by: 行者123. Updated: 2023-11-29 16:26:04

I ran into this SQL problem and would appreciate your help.

I have the following table, named scores:

| student_id | x1 | x2 | x3 | y1 | y2 | z1 | z2 | z3 | z4 |
| ---------- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| 1          | 5  | 3  | 1  | 4  | 3  | 3  | 4  | 1  | 2  |
| 2          | 5  | 3  | 3  | 4  | 2  | 1  | 5  | 2  | 3  |
| 3          | 4  | 2  | 2  | 1  | 1  | 3  | 4  | 3  | 4  |
| 4          | 1  | 4  | 5  | 4  | 5  | 3  | 5  | 5  | 4  |

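student_id is the PRIMARY KEY. The other columns (x1, x2, ...) are TINYINT(1) with values in the range 1..5 (inclusive).

Goal:

Compute the score of a given student_id against a given list of student_ids.
The result set should contain two columns: student_id (excluding the given one) and final_score. It must be ordered by final_score DESC.

Formula for the final_score of student A against student B:

Given: the two records for students A and B, and a list of score columns grouped by category. For example, category X has 3 questions, category Y has 2 questions, and category Z has 4 questions.
First, compute the average absolute difference for each category.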
AVG_X = ( ABS(XA1 - XB1) + ABS(XA2 - XB2) + ABS(XA3 - XB3) )/3
AVG_Y = ( ABS(YA1 - YB1) + ABS(YA2 - YB2) )/2
AVG_Z = ( ABS(ZA1 - ZB1) + ABS(ZA2 - ZB2) + ABS(ZA3 - ZB3) + ABS(ZA4 - ZB4) )/4

Here AVG is the category average and ABS takes the absolute value.

Finally, the final score is calculated as:
FINAL_SCORE = 5 - ((AVG_X + AVG_Y + AVG_Z)/3)
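
As a worked check of the formula, the following Python sketch applies it to the four sample rows from the table above, scoring student 1 against the others. The names ROWS, final_score and rank_against are illustrative only and do not come from the original question.

```python
# Sample rows from the scores table: student_id -> answers grouped by category.
ROWS = {
    1: {"x": (5, 3, 1), "y": (4, 3), "z": (3, 4, 1, 2)},
    2: {"x": (5, 3, 3), "y": (4, 2), "z": (1, 5, 2, 3)},
    3: {"x": (4, 2, 2), "y": (1, 1), "z": (3, 4, 3, 4)},
    4: {"x": (1, 4, 5), "y": (4, 5), "z": (3, 5, 5, 4)},
}


def final_score(a, b):
    """Return 5 minus the mean of the per-category average absolute differences."""
    category_avgs = [
        sum(abs(p - q) for p, q in zip(a[cat], b[cat])) / len(a[cat])
        for cat in ("x", "y", "z")
    ]
    return 5 - sum(category_avgs) / len(category_avgs)


def rank_against(given_id, rows=ROWS):
    """Return (student_id, final_score) pairs, excluding given_id, best first."""
    scored = [
        (sid, final_score(rows[given_id], row))
        for sid, row in rows.items()
        if sid != given_id
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for sid, score in rank_against(1):
        print(sid, round(score, 4))
```

For student 1 this ranks student 2 first (about 4.19), then student 3 (3.5), then student 4 (about 3.08).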

Based on this, I wrote the following SQL query.

SELECT 
    f.student_id, 
    5 - ( avg_cate_x + avg_cate_y + avg_cate_z ) / 3 as final_score
FROM
(
    SELECT
        s.student_id,
        (
            ABS(s.x1 - u.x1) + ABS(s.x2 - u.x2) + ABS(s.x3 - u.x3)
        ) / 3 AS avg_cate_x,
        (
            ABS(s.y1 - u.y1) + ABS(s.y2 - u.y2)
        ) / 2 AS avg_cate_y,
        (
            ABS(s.z1 - u.z1) + ABS(s.z2 - u.z2) + ABS(s.z3 - u.z3) + ABS(s.z4 - u.z4)
        ) / 4 AS avg_cate_z

    FROM scores AS s

    JOIN
    ( SELECT * FROM scores WHERE scores.student_id = 1 ) AS u
) AS f

ORDER by final_score DESC;

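When I run it to get the final scores for student_id = 1 (compared against the rest of a table with 50k records), it is very slow: it takes 970 ms.

Here is the EXPLAIN output: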

+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | scores | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | Using filesort        |
|  1 | SIMPLE      | s      | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL  |    4 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-----------------------+

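Is there any way to improve this query? If you have a better idea, I would really appreciate it.

Suggestions

@Muhammad Waheed: use INNER JOIN instead of JOIN. It was indeed 12% faster.

Here is the updated query: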

SELECT 
    agged.user_id, 
    5 - (
        avg_cate_06 + avg_cate_07 + avg_cate_08 + 
        avg_cate_09 + avg_cate_10 + avg_cate_11 + 
        avg_cate_12 
    ) / 7 as final_score
FROM
(
    SELECT
        s.user_id,
        (
            ABS(s.q1 - u.q1) + ABS(s.q2 - u.q2) + ABS(s.q3 - u.q3) + ABS(s.q4 - u.q4) + ABS(s.q5 - u.q5) +
            ABS(s.q6 - u.q6) + ABS(s.q7 - u.q7) + ABS(s.q8 - u.q8) + ABS(s.q9 - u.q9) + ABS(s.q10 - u.q10) +
            ABS(s.q11 - u.q11) + ABS(s.q12 - u.q12) + ABS(s.q13 - u.q13) + ABS(s.q14 - u.q14) + ABS(s.q15 - u.q15) +
            ABS(s.q16 - u.q16) + ABS(s.q17 - u.q17) + ABS(s.q18 - u.q18) + ABS(s.q19 - u.q19) + ABS(s.q20 - u.q20) 
        ) / 20 AS avg_cate_06,

        (
            ABS(s.q21 - u.q21) + ABS(s.q22 - u.q22) + ABS(s.q23 - u.q23) + ABS(s.q24 - u.q24) + ABS(s.q25 - u.q25) + 
            ABS(s.q26 - u.q26) + ABS(s.q27 - u.q27) + ABS(s.q28 - u.q28) + ABS(s.q29 - u.q29) + ABS(s.q30 - u.q30) + 
            ABS(s.q31 - u.q31) + ABS(s.q32 - u.q32) + ABS(s.q33 - u.q33) + ABS(s.q34 - u.q34) + ABS(s.q35 - u.q35) + 
            ABS(s.q36 - u.q36) + ABS(s.q37 - u.q37) + ABS(s.q38 - u.q38) + ABS(s.q39 - u.q39) + ABS(s.q40 - u.q40) + 
            ABS(s.q41 - u.q41) + ABS(s.q42 - u.q42) + ABS(s.q43 - u.q43) + ABS(s.q44 - u.q44) + ABS(s.q45 - u.q45) + 
            ABS(s.q46 - u.q46) + ABS(s.q47 - u.q47) + ABS(s.q48 - u.q48) + ABS(s.q49 - u.q49) + ABS(s.q50 - u.q50) + 
            ABS(s.q51 - u.q51) + ABS(s.q52 - u.q52) + ABS(s.q53 - u.q53) + ABS(s.q54 - u.q54) + ABS(s.q55 - u.q55) + 
            ABS(s.q56 - u.q56) + ABS(s.q57 - u.q57) + ABS(s.q58 - u.q58) + ABS(s.q59 - u.q59) + ABS(s.q60 - u.q60) + 
            ABS(s.q61 - u.q61)
        ) / 41 AS avg_cate_07,

        (
            ABS(s.q62 - u.q62) + ABS(s.q63 - u.q63) + ABS(s.q64 - u.q64) + ABS(s.q65 - u.q65) + ABS(s.q66 - u.q66) + 
            ABS(s.q67 - u.q67) + ABS(s.q68 - u.q68) + ABS(s.q69 - u.q69) + ABS(s.q70 - u.q70) + ABS(s.q71 - u.q71) + 
            ABS(s.q72 - u.q72) + ABS(s.q73 - u.q73) + ABS(s.q74 - u.q74) + ABS(s.q75 - u.q75)
        ) / 14 AS avg_cate_08,

        (
            ABS(s.q76 - u.q76) + ABS(s.q77 - u.q77) + ABS(s.q78 - u.q78) + ABS(s.q79 - u.q79) + ABS(s.q80 - u.q80) + 
            ABS(s.q81 - u.q81) + ABS(s.q82 - u.q82) + ABS(s.q83 - u.q83) + ABS(s.q84 - u.q84) + ABS(s.q85 - u.q85) + 
            ABS(s.q86 - u.q86) + ABS(s.q87 - u.q87) + ABS(s.q88 - u.q88) + ABS(s.q89 - u.q89) + ABS(s.q90 - u.q90) + 
            ABS(s.q91 - u.q91) + ABS(s.q92 - u.q92) + ABS(s.q93 - u.q93) + ABS(s.q94 - u.q94) + ABS(s.q95 - u.q95)
        ) / 20 AS avg_cate_09,

        (
            ABS(s.q96 - u.q96)   + ABS(s.q97 - u.q97)   + ABS(s.q98 - u.q98)   + ABS(s.q99 - u.q99)   + ABS(s.q100 - u.q100) + 
            ABS(s.q101 - u.q101) + ABS(s.q102 - u.q102) + ABS(s.q103 - u.q103) + ABS(s.q104 - u.q104) + ABS(s.q105 - u.q105) +
            ABS(s.q106 - u.q106) + ABS(s.q107 - u.q107) + ABS(s.q108 - u.q108) + ABS(s.q109 - u.q109) + ABS(s.q110 - u.q110) + 
            ABS(s.q111 - u.q111) + ABS(s.q112 - u.q112) + ABS(s.q113 - u.q113) + ABS(s.q114 - u.q114) + ABS(s.q115 - u.q115)
        ) / 20 AS avg_cate_10,

        (
            ABS(s.q116 - u.q116) + ABS(s.q117 - u.q117) + ABS(s.q118 - u.q118) + ABS(s.q119 - u.q119) + ABS(s.q120 - u.q120) + 
            ABS(s.q121 - u.q121) + ABS(s.q122 - u.q122) + ABS(s.q123 - u.q123) + ABS(s.q124 - u.q124) + ABS(s.q125 - u.q125) + 
            ABS(s.q126 - u.q126) + ABS(s.q127 - u.q127)
        ) / 12 AS avg_cate_11,

        (
            ABS(s.q128 - u.q128) + ABS(s.q129 - u.q129) + ABS(s.q130 - u.q130) + ABS(s.q131 - u.q131) + ABS(s.q132 - u.q132) + 
            ABS(s.q133 - u.q133) + ABS(s.q134 - u.q134) + ABS(s.q135 - u.q135) + ABS(s.q136 - u.q136) + ABS(s.q137 - u.q137) + 
            ABS(s.q138 - u.q138) + ABS(s.q139 - u.q139) + ABS(s.q140 - u.q140) + ABS(s.q141 - u.q141) + ABS(s.q142 - u.q142) + 
            ABS(s.q143 - u.q143) + ABS(s.q144 - u.q144) + ABS(s.q145 - u.q145) + ABS(s.q146 - u.q146) + ABS(s.q147 - u.q147) + 
            ABS(s.q148 - u.q148) + ABS(s.q149 - u.q149) + ABS(s.q150 - u.q150) + ABS(s.q151 - u.q151) + ABS(s.q152 - u.q152) + 
            ABS(s.q153 - u.q153) + ABS(s.q154 - u.q154) + ABS(s.q155 - u.q155) + ABS(s.q156 - u.q156) + ABS(s.q157 - u.q157)
        ) / 30 AS avg_cate_12

    FROM scores AS s

    INNER JOIN
        scores AS u ON u.user_id = 1 

) AS agged


ORDER by final_score DESC;

The execution plan becomes:

+----+-------------+-------+------------+-------+---------------+---------+---------+-------+-------+----------+----------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref   | rows  | filtered | Extra          |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+-------+----------+----------------+
|  1 | SIMPLE      | u     | NULL       | const | PRIMARY       | PRIMARY | 4       | const |     1 |   100.00 | Using filesort |
|  1 | SIMPLE      | s     | NULL       | ALL   | NULL          | NULL    | NULL    | NULL  | 49999 |   100.00 | NULL           |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+-------+----------+----------------+

The query cost is:

mysql> SHOW STATUS LIKE 'Last_query_cost';
+-----------------+-------------+
| Variable_name   | Value       |
+-----------------+-------------+
| Last_query_cost | 5494.773878 |
+-----------------+-------------+

Thanks.

Best answer

MySQL may materialize subqueries in the FROM clause (derived tables). So you can try writing this without the subquery:

SELECT s.student_id,
       (ABS(s.x1 - u.x1) + ABS(s.x2 - u.x2) + ABS(s.x3 - u.x3)
       ) / 3 AS avg_cate_x,
       (ABS(s.y1 - u.y1) + ABS(s.y2 - u.y2)
       ) / 2 AS avg_cate_y,
       (ABS(s.z1 - u.z1) + ABS(s.z2 - u.z2) + ABS(s.z3 - u.z3) + ABS(s.z4 - u.z4)
       ) / 4 AS avg_cate_z,
       5 - (
        (ABS(s.x1 - u.x1) + ABS(s.x2 - u.x2) + ABS(s.x3 - u.x3)
        ) / 3 +
        (ABS(s.y1 - u.y1) + ABS(s.y2 - u.y2)
        ) / 2 +
        (ABS(s.z1 - u.z1) + ABS(s.z2 - u.z2) + ABS(s.z3 - u.z3) + ABS(s.z4 - u.z4)
        ) / 4
       ) / 3 as final_score
FROM scores s JOIN
     scores u
     ON u.student_id = 1
ORDER by final_score DESC;

This is not as elegant as using a subquery, but you may see a performance improvement.

In addition, an index on scores(student_id) helps. Since student_id is already the PRIMARY KEY in this table, that index exists.
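
To check that flattening the subquery does not change the result, here is a minimal sketch that runs both query shapes on the four sample rows. It makes several assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, it divides by 3.0, 2.0 and 4.0 because SQLite performs integer division on integers (MySQL's / does not), and it adds a WHERE clause to exclude the given student, as the stated goal requires. It says nothing about MySQL performance; it only compares the results.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (student_id INTEGER PRIMARY KEY, "
    "x1 INTEGER, x2 INTEGER, x3 INTEGER, y1 INTEGER, y2 INTEGER, "
    "z1 INTEGER, z2 INTEGER, z3 INTEGER, z4 INTEGER)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 5, 3, 1, 4, 3, 3, 4, 1, 2),
        (2, 5, 3, 3, 4, 2, 1, 5, 2, 3),
        (3, 4, 2, 2, 1, 1, 3, 4, 3, 4),
        (4, 1, 4, 5, 4, 5, 3, 5, 5, 4),
    ],
)

# Shape 1: category averages in a derived table, final score outside.
NESTED = """
SELECT f.student_id,
       5 - (avg_cate_x + avg_cate_y + avg_cate_z) / 3 AS final_score
FROM (
    SELECT s.student_id,
           (ABS(s.x1 - u.x1) + ABS(s.x2 - u.x2) + ABS(s.x3 - u.x3)) / 3.0 AS avg_cate_x,
           (ABS(s.y1 - u.y1) + ABS(s.y2 - u.y2)) / 2.0 AS avg_cate_y,
           (ABS(s.z1 - u.z1) + ABS(s.z2 - u.z2) + ABS(s.z3 - u.z3) + ABS(s.z4 - u.z4)) / 4.0 AS avg_cate_z
    FROM scores AS s
    JOIN scores AS u ON u.student_id = ?
    WHERE s.student_id <> u.student_id
) AS f
ORDER BY final_score DESC, f.student_id
"""

# Shape 2: the same arithmetic written inline, without the derived table.
FLAT = """
SELECT s.student_id,
       5 - (
           (ABS(s.x1 - u.x1) + ABS(s.x2 - u.x2) + ABS(s.x3 - u.x3)) / 3.0 +
           (ABS(s.y1 - u.y1) + ABS(s.y2 - u.y2)) / 2.0 +
           (ABS(s.z1 - u.z1) + ABS(s.z2 - u.z2) + ABS(s.z3 - u.z3) + ABS(s.z4 - u.z4)) / 4.0
       ) / 3 AS final_score
FROM scores AS s
JOIN scores AS u ON u.student_id = ?
WHERE s.student_id <> u.student_id
ORDER BY final_score DESC, s.student_id
"""

nested_rows = conn.execute(NESTED, (1,)).fetchall()
flat_rows = conn.execute(FLAT, (1,)).fetchall()

for (sid, nested_score), (_, flat_score) in zip(nested_rows, flat_rows):
    print(sid, round(nested_score, 4), round(flat_score, 4))
```

Both shapes return students 2, 3 and 4 in that order with matching scores. Whether the flat form is actually faster on a given MySQL version is something to confirm with EXPLAIN and timing on the real table.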

Regarding "MySQL - Improving a complex query with custom calculated columns", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54229050/
