Duplicate rows can be removed cleanly with two features introduced in SQL Server 2005: ROW_NUMBER and CTEs (common table expressions). The following example shows how.

Create the test data:

    CREATE TABLE Dup1
    (
        Col1 int NULL,
        Col2 varchar(20) NULL
    )

    -- Sample rows with duplicates (reconstructed; the exact original rows are an assumption).
    -- Multi-row VALUES requires SQL Server 2008 or later; on SQL Server 2005
    -- use one INSERT statement per row instead.
    INSERT INTO Dup1 VALUES
    (1, 'aaa'),
    (2, 'aaa'),
    (2, 'aaa'),
    (2, 'aaa'),
    (3, 'bbb'),
    (3, 'bbb'),
    (4, 'ccc'),
    (4, 'ddd'),
    (5, 'eee')

    SELECT * FROM Dup1

The following query shows which rows are duplicated:

    SELECT Col1, Col2, COUNT(*) AS DupCount
    FROM Dup1
    GROUP BY Col1, Col2
    HAVING COUNT(*) > 1
<IMG src="http:https://files.52php.cn/upload/201006/20100608002211701.jpg">
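Because each of the deletion methods that follow modifies Dup1, it can help to try them inside a transaction and roll back afterwards, so that every method starts from the same data. The following is a minimal sketch of that pattern, not part of the original article; the alias `G` and the column name `SurplusRows` are illustrative names chosen here.

```sql
BEGIN TRANSACTION;

    -- Number of rows a de-duplication is expected to remove:
    -- for each duplicated (Col1, Col2) group, every copy except one.
    SELECT SUM(G.DupCount - 1) AS SurplusRows
    FROM (
        SELECT COUNT(*) AS DupCount
        FROM Dup1
        GROUP BY Col1, Col2
        HAVING COUNT(*) > 1
    ) AS G;

    -- ... run one of the deletion methods here ...

    -- After a successful de-duplication this query should return no rows.
    SELECT Col1, Col2, COUNT(*) AS DupCount
    FROM Dup1
    GROUP BY Col1, Col2
    HAVING COUNT(*) > 1;

ROLLBACK TRANSACTION;  -- restore Dup1 so the next method sees the original rows
```

Replace ROLLBACK with COMMIT once the result looks right and the change should be kept.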
Next, how to delete the duplicate rows:
1. ROW_NUMBER: SQL Server 2005 added the ranking functions (ROW_NUMBER, RANK, DENSE_RANK, NTILE). Using ROW_NUMBER() OVER (PARTITION BY ...) is the most direct and convenient approach, and it requires neither modifying the table nor adding extra columns.

First, assign a row number to each row, partitioned by the combination of Col1 and Col2:

    SELECT Col1, Col2,
           ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY Col1) AS rn
    FROM Dup1

The resulting row numbers are as follows:

<IMG src="http:https://files.52php.cn/upload/201006/20100608002211200.jpg">

The duplicate rows are grouped into partitions and numbered within each one, so it is enough to delete the rows whose row number is greater than 1.

    -- Using a CTE
    WITH DupsD AS
    (
        SELECT Col1, Col2,
               ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY Col1) AS rn
        FROM Dup1
    )
    DELETE DupsD WHERE rn > 1;

    -- Or, using a derived table
    DELETE A
    FROM (
        SELECT Col1, Col2,
               ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY Col1) AS rn
        FROM Dup1
    ) A
    WHERE A.rn > 1

2. Add an identity column that uniquely identifies each row.

    ALTER TABLE dbo.Dup1
    ADD PK INT IDENTITY NOT NULL
        CONSTRAINT PK_Dup1 PRIMARY KEY;

    SELECT * FROM Dup1;

Then delete every row for which another row with the same Col1 and Col2 but a smaller PK exists. In other words, keep the row with the smallest PK in each group of duplicates.

    DELETE Dup1
    WHERE EXISTS
    (
        SELECT *
        FROM Dup1 AS D1
        WHERE D1.Col1 = Dup1.Col1
          AND D1.Col2 = Dup1.Col2
          AND D1.PK < Dup1.PK   -- an earlier copy exists, so this row is surplus
    );

3. SELECT DISTINCT ... INTO: this method uses a new table and copies the de-duplicated result set into it.

    SELECT DISTINCT Col1, Col2 INTO NoDups
    FROM Dup1;

    SELECT * FROM NoDups

Methods 1 and 3 are recommended: the first is common in T-SQL programming, and the third is frequently used in ETL.

(Editor: 李大同)
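Method 1 lists four ranking functions but uses only ROW_NUMBER. The sketch below, which is an illustration and not part of the original article, runs three of them side by side to suggest why. The temporary table `#RankDemo` and the column aliases are hypothetical names chosen for this demo. Since every row in a (Col1, Col2) partition has the same Col1, the ORDER BY key is tied, so RANK and DENSE_RANK return 1 for every row and a filter such as `rn > 1` would match nothing. ROW_NUMBER still numbers the rows 1, 2, 3 within each partition.

```sql
-- Hypothetical scratch table, so the demo does not depend on the state of Dup1.
CREATE TABLE #RankDemo (Col1 int NULL, Col2 varchar(20) NULL);

-- One INSERT per row keeps the script valid on SQL Server 2005.
INSERT INTO #RankDemo (Col1, Col2) VALUES (2, 'aaa');
INSERT INTO #RankDemo (Col1, Col2) VALUES (2, 'aaa');
INSERT INTO #RankDemo (Col1, Col2) VALUES (2, 'aaa');
INSERT INTO #RankDemo (Col1, Col2) VALUES (3, 'bbb');

SELECT Col1, Col2,
       -- Unique sequence inside each partition, even when the ORDER BY key is tied.
       ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY Col1) AS row_num,
       -- Tied rows share a rank, so every row in a duplicate group gets 1.
       RANK()       OVER (PARTITION BY Col1, Col2 ORDER BY Col1) AS rnk,
       DENSE_RANK() OVER (PARTITION BY Col1, Col2 ORDER BY Col1) AS dense_rnk
FROM #RankDemo;

DROP TABLE #RankDemo;
```

Which copy within a duplicate group receives row number 1 is not deterministic here, which is acceptable because the copies are identical in every column.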