Postgresql 数据在一张表中存在,另一张表不满足完整性的查找
发布时间:2020-12-13 17:23:30 所属栏目:百科 来源:网络整理
导读:有两张表T1,T2,表结构和数据如下: create table t1 ( id int);create table t2 ( id int,finished int);insert into t1 values (1);insert into t1 values (2);insert into t1 values (3);insert into t1 values (4);insert into t1 values (5);insert int
有两张表T1,T2,表结构和数据如下:
create table t1 ( id int ); create table t2 ( id int,finished int ); insert into t1 values (1); insert into t1 values (2); insert into t1 values (3); insert into t1 values (4); insert into t1 values (5); insert into t1 values (6); insert into t2 values (1,1); insert into t2 values (2,0); insert into t2 values (3,1); insert into t2 values (4,0);SQL Fiddle 可以测试SQL语句的执行。 想要实现T1中存在前提下,T2中不存在或者finished=0,也就是查询结果是:2,4,5,6. 一、性能测试:1. not in:explain (analyze,verbose,costs,buffers) select ID from T1 where ID not in (select ID from T2 where finished=1);Total runtime: 0.128 ms 2. not exists: explain (analyze,buffers) select ID from T1 where not exists (select 1 from T2 where T1.ID=T2.ID and T2.finished=1);Total runtime: 0.105 ms 3. left join: explain (analyze,buffers) select T1.ID from T1 left join T2 on T1.ID=T2.ID and T2.finished=1 where T2.ID is null;Total runtime: 0.096 ms 4. 网上还看到一种更快方法,但测试下来此方法不对,所以不讨论: select ID from T2 where (select count(1) from T1 where T1.ID=T2.ID) = 0; 这条语句查询结果为空 因此 在postgresql 9.3 上语句执行速度 left join > not exists > not in 当T1和T2表中ID出现null时,not in 语句会有不同的表现,所以推荐总是用not exists 代替 not in. 二、大数据量性能测试:在大量数据的时候,not in有严重性能下降的问题,下面是我在i5 2.4GHz MAC pro 13吋上的测试。department(T1) 为59280条数据,数据长度29字符;dept(T2) 为23633条数据,数据长度29字符。 1. explain analyze select department.id from department where department.id not in (select id from dept where finished=true); Total runtime: 447073.065 ms 2. explain analyze select department.id from department where not exists (select 1 from dept where department.id=dept.id and finished=true); Total runtime: 325.732 ms 3. explain analyze select department.id from department left join dept on department.id=dept.id and dept.finished=true where dept.id is null; Total runtime: 319.869 ms 三、总结:在Postgresql 9.3上:not in 不仅性能差,而且逻辑可能有问题。 not exists 性能不错,思考起来比较容易。 left join 性能最好,但总体跟not exists 比也快不了多少,思考稍微有点绕。 下面是一张网上的left join 的图,但找不到出处,有助于理解 left join 的过程:
(编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |