Three data table de duplication methods of MSSQL SQL Server

Abstract:

Three different data De duplication methods will be shared below
Data De duplication: it needs to be defined according to a certain field. When there is more than one line of records in this field, we define that there are duplicates in this line of data.


Data De duplication method 1:

When the maximum SN is in the table, we can get a unique value for each duplicate record through association

Data De duplication method 2:
For the records in the table, group according to the specified fields, obtain the maximum serial number, and then perform the de duplication operation
 
Data De duplication method 3:
After grouping, the ranking in the duplicate data group is used. If the ranking is greater than 1, it means duplicate data row data
 
Comparison of efficiency of three methods of weight removal:
Method 3 > method 2 > method 1
 

create table test(keyId int identity,sort varchar(10),
info varchar(20))
go

---Method 1 
truncate table test ;

insert into test(sort,info)values('A','maomao365.com')--1
insert into test(sort,info)values('A','Cat house') --2
insert into test(sort,info)values('B','mssql_blog') --3
insert into test(sort,info)values('B','First-class sql-blog') --4
insert into test(sort,info)values('B','maomao365') --5
insert into test(sort,info)values('C','sql optimization blog') --6
go

delete from test where test.keyId = (select max(b.keyId) from test b where test.sort=b.sort);
select * from test 
---Method 2:
truncate table test ; 
insert into test(sort,info)values('A','maomao365.com')
insert into test(sort,info)values('A','Cat house')
insert into test(sort,info)values('B','mssql_blog')
insert into test(sort,info)values('B','First-class sql-blog')
insert into test(sort,info)values('B','maomao365')
insert into test(sort,info)values('C','sql optimization blog')
go
delete from test 
where keyid not in(select min(keyId) from test group by sort having count(sort)>=1);
select * from test 
---Method 3:
truncate table test ; 
insert into test(sort,info)values('A','maomao365.com')
insert into test(sort,info)values('A','Cat house')
insert into test(sort,info)values('B','mssql_blog')
insert into test(sort,info)values('B','First-class sql-blog')
insert into test(sort,info)values('B','maomao365')
insert into test(sort,info)values('C','sql optimization blog')
go
delete A2 from (
select row_Number() over(partition by sort order by keyid) as keyId_e,* from test 
) as A2 where A2.keyId_e >1

select * from test 
go
drop table test 

 

< img SRC = "http://www.maomao365.com/wp-content/uploads/2018/07/mssql ﹣ SQL Server ﹣ three methods of data table de duplication share. PNG" ALT = "MSSQL ﹣ SQL Server ﹣ three methods of data table de duplication share" width = "813" height = "749" class = "size full wp-image-6767" / >

 

From: http://www.maomao365.com/?p=6766

Tags: SQL Sever SQL

Posted on Mon, 02 Dec 2019 01:35:04 -0500 by PandaFi