
MySQL: delete duplicates without using the primary key

Reposted. Author: 行者123. Updated: 2023-11-29 22:36:16

I have this table in a MySQL database.

id  filename   firstname  lastname   dept        salary
1   test1.csv  Jan        Thomas     Sales       5000
2   test1.csv  Jan        Michael    Sales       200
3   test1.csv  Thomas     John       Technology  12900
4   test2.csv  Robert     James      Technology  5500
5   test2.csv  Robert     Albertson  Technology  6000
6   test2.csv  Mark       Jeffries   Technology  900
7   test2.csv  Ted        James      Technology  10000
8   test2.csv  Mayla      Arthurs    Technology  7000
9   test2.csv  Mayla      Smith      Technology  9500
10  test3.csv  Mayla      Anthony    Technology  3000
11  test3.csv  Mayla      Mark       Technology  3000
12  test4.csv  Mayla      Roberts    Technology  8500
13  test4.csv  Anthony    Anderson   Marketing   9500
14  test5.csv  Anthony    Smith      Technology  6000
15  test5.csv  Jan        Thomas     Sales       5000
16  test5.csv  Jan        Michael    Sales       200
17  test5.csv  Thomas     John       Technology  12900
18  test1.csv  Jan        Michael    Sales       8000
19  test1.csv  Thomas     John       Technology  1540
20  test2.csv  Mayla      Smith      Technology  10500
21  test3.csv  Mayla      Anthony    Technology  5600
22  test4.csv  Anthony    Anderson   Marketing   2500
23  test5.csv  Brian      Earl       HR          1200
24  test5.csv  John       Smith      HR_Sales    2000
25  test6.csv  Jan        Thomas     HR_Sales    12000
26  test6.csv  Jan        Michael    Education   1500
27  test7.csv  Thomas     John       HR_Sales    1000

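The SQL code that creates the table is at the end of this post. Each record consists of a filename, first name, last name, department, and salary. Sometimes the same record exists in more than one file, and I can't have those duplicate records.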

As you can see, id = 15, 16, 17 are duplicates of id = 1, 2, 3 respectively.

I need to delete the duplicates where the filename differs but the rest of the record is identical.
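Before deleting anything, it can help to see which records actually occur in more than one file. The query below is a minimal sketch of that check, assuming the `employee` table defined at the end of this post; the aliases `file_count` and `files` are illustrative names, not part of the original question.

```sql
-- List every record (ignoring id and filename) that appears in more than one file.
SELECT firstname,
       lastname,
       dept,
       salary,
       COUNT(DISTINCT filename) AS file_count,                  -- illustrative alias
       GROUP_CONCAT(DISTINCT filename ORDER BY filename) AS files -- illustrative alias
FROM employee
GROUP BY firstname, lastname, dept, salary
HAVING COUNT(DISTINCT filename) > 1;
```

With the sample rows above, this should list the three records shared by test1.csv and test5.csv.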

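Additional information:

  1. I can't use DELETE FROM employee WHERE id IN (15, 16, 17) because I don't know in advance which rows will be duplicates.
  2. The table is updated continuously as more *.csv files are added to it. This means that if I create a new index column, I can't append *.csv files that contain duplicates of records already in the database. So I can't use an index column or GROUP BY().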
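The concern in point 2 may be avoidable in some setups. As a hedged sketch, not something proposed in the original question or answer: if a unique index over the record columns were acceptable, `INSERT IGNORE` would skip incoming rows that duplicate an existing record instead of failing the load. The index name `uq_employee_record`, the file name `test8.csv`, and the new employee row are made up for illustration, and the sketch assumes existing duplicates have already been removed.

```sql
-- Assumption: no duplicate records remain in the table; otherwise this
-- ALTER TABLE fails with a duplicate-entry error.
ALTER TABLE employee
  ADD UNIQUE INDEX uq_employee_record (firstname, lastname, dept, salary);

-- INSERT IGNORE turns a unique-key violation into a warning and skips that row.
INSERT IGNORE INTO employee (filename, firstname, lastname, dept, salary)
VALUES ('test8.csv', 'Jan',  'Thomas', 'Sales', 5000),  -- hypothetical: duplicates id 1, skipped
       ('test8.csv', 'Nina', 'Lopez',  'Sales', 4100);  -- hypothetical: new record, inserted
```

Two caveats apply: `INSERT IGNORE` also downgrades other errors to warnings, and a unique index treats NULL values as distinct, so records with NULL in any of these columns would not be deduplicated.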

Is there a way to delete the duplicate rows without using the PK column?

SQL code to create the table above:

CREATE SCHEMA dupl_test;

USE dupl_test;

create table employee (
id INT AUTO_INCREMENT PRIMARY KEY,
filename varchar(20),
firstname varchar(20),
lastname varchar(20),
dept varchar(10),
salary int(10)
);

insert into employee values(1,'test1.csv','Jan','Thomas','Sales',5000);
insert into employee values(2,'test1.csv','Jan','Michael','Sales',200);
insert into employee values(3,'test1.csv','Thomas','John','Technology',12900);
insert into employee values(4,'test2.csv','Robert','James','Technology',5500);
insert into employee values(5,'test2.csv','Robert','Albertson','Technology',6000);
insert into employee values(6,'test2.csv','Mark','Jeffries','Technology',900);
insert into employee values(7,'test2.csv','Ted','James','Technology',10000);
insert into employee values(8,'test2.csv','Mayla','Arthurs','Technology',7000);
insert into employee values(9,'test2.csv','Mayla','Smith','Technology',9500);
insert into employee values(10,'test3.csv','Mayla','Anthony','Technology',3000);
insert into employee values(11,'test3.csv','Mayla','Mark','Technology',3000);
insert into employee values(12,'test4.csv','Mayla','Roberts','Technology',8500);
insert into employee values(13,'test4.csv','Anthony', 'Anderson','Marketing',9500);
insert into employee values(14,'test5.csv','Anthony','Smith','Technology',6000);
insert into employee values(15,'test5.csv','Jan','Thomas','Sales',5000);
insert into employee values(16,'test5.csv','Jan','Michael','Sales',200);
insert into employee values(17,'test5.csv','Thomas','John','Technology',12900);
insert into employee values(18,'test1.csv','Jan','Michael','Sales',8000);
insert into employee values(19,'test1.csv','Thomas','John','Technology',1540);
insert into employee values(20,'test2.csv','Mayla','Smith','Technology',10500);
insert into employee values(21,'test3.csv','Mayla','Anthony','Technology',5600);
insert into employee values(22,'test4.csv','Anthony', 'Anderson','Marketing',2500);
insert into employee values(23,'test5.csv','Brian','Earl','HR',1200);
insert into employee values(24,'test5.csv','John','Smith','HR_Sales',2000);
insert into employee values(25,'test6.csv','Jan','Thomas','HR_Sales',12000);
insert into employee values(26,'test6.csv','Jan','Michael','Education',1500);
insert into employee values(27,'test7.csv','Thomas','John','HR_Sales',1000);

Best answer

You can use a DELETE with a JOIN in MySQL to remove the duplicates:

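-- For each distinct record, keep only the copy from the alphabetically
-- first filename and delete the copies that came from other files.
-- Note: "=" never matches NULL, so a row with NULL in any compared column
-- finds no match in tokeep and is deleted as well.
delete e
from employee e left join
     (select firstname, lastname, dept, salary, min(filename) as filename
      from employee
      group by firstname, lastname, dept, salary
     ) tokeep
     on e.firstname = tokeep.firstname and e.lastname = tokeep.lastname and
        e.dept = tokeep.dept and e.salary = tokeep.salary and
        tokeep.filename = e.filename
where tokeep.filename is null;  -- no match: this row is not the copy to keep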
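Since a DELETE is hard to undo, one cautious approach is to run the same join as a SELECT first and inspect the rows it would remove. The following is a sketch of such a dry run, assuming the `employee` table from the question; it reuses the `tokeep` derived table from the answer and introduces nothing else.

```sql
-- Dry run: show the rows the DELETE above would remove, without changing data.
SELECT e.*
FROM employee e
LEFT JOIN (
    SELECT firstname, lastname, dept, salary, MIN(filename) AS filename
    FROM employee
    GROUP BY firstname, lastname, dept, salary
) tokeep
  ON  e.firstname = tokeep.firstname
  AND e.lastname  = tokeep.lastname
  AND e.dept      = tokeep.dept
  AND e.salary    = tokeep.salary
  AND e.filename  = tokeep.filename
WHERE tokeep.filename IS NULL
ORDER BY e.id;
```

With the sample data this should return the rows with id 15, 16, and 17. On a transactional engine such as InnoDB, the DELETE itself can also be wrapped in START TRANSACTION and followed by ROLLBACK or COMMIT once the result has been checked.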

Regarding "MySQL: delete duplicates without using the primary key", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/29527787/
