postgresql – postgres查询优化问题
发布时间:2020-12-13 15:53:51 所属栏目:百科 来源:网络整理
导读:下面是两个几乎相同的postgres查询,但产生了截然不同的查询计划和执行时间.我假设第一个查询很快,因为form_id =’W40’只有196个form_instance记录,而form_id =’W30L’则有7000个.但是为什么从200个记录跳到7000个记录(这对我来说似乎相对较小)导致查询时间
下面是两个几乎相同的postgres查询,但产生了截然不同的查询计划和执行时间.我假设第一个查询很快,因为form_id =’W40’只有196个form_instance记录,而form_id =’W30L’则有7000个.但是为什么从200个记录跳到7000个记录(这对我来说似乎相对较小)导致查询时间如此惊人地增加?我试图以各种方式对数据进行索引以加快速度,但基本上是陷入困境.我怎样才能加快速度呢? (请注意,两个表的模式都包含在底部).
explain analyze select form_id,form_instance_id,answer,field_id from form_instances,field_instances where workflow_state = 'DRqueued' and form_instance_id = form_instances.id and field_id in ('Book_EstimatedDueDate','H_SubmittedDate','H_Ccode','miscarriage','miscarriage_of_multiple','stillbirth','AP_IUFD_of_multiple','maternal_death','birth_includes_transport','newborn_death','H_Pid','H_Mid1','H_Mid2','H_Mid3') and (form_id = 'W40'); QUERY PLAN Nested Loop (cost=0.00..70736.14 rows=4646 width=29) (actual time=0.000..20.000 rows=2399 loops=1) -> Index Scan using form_id_and_workflow_state on form_instances (cost=0.00..1041.42 rows=507 width=8) (actual time=0.000..0.000 rows=196 loops=1) Index Cond: (((form_id)::text = 'W40'::text) AND ((workflow_state)::text = 'DRqueued'::text)) -> Index Scan using index_field_instances_on_form_instance_id on field_instances (cost=0.00..137.25 rows=17 width=25) (actual time=0.000..0.102 rows=12 loops=196) Index Cond: (field_instances.form_instance_id = form_instances.id) Filter: ((field_instances.field_id)::text = ANY ('{Book_EstimatedDueDate,H_SubmittedDate,H_Ccode,miscarriage,miscarriage_of_multiple,stillbirth,AP_IUFD_of_multiple,maternal_death,birth_includes_transport,newborn_death,H_Pid,H_Mid1,H_Mid2,H_Mid3}'::text[])) Total runtime: 30.000 ms (7 rows) explain analyze select form_id,field_id from form_instances,'H_Mid3') and (form_id = 'W30L'); QUERY PLAN Hash Join (cost=34300.46..160865.40 rows=31045 width=29) (actual time=65670.000..74960.000 rows=102777 loops=1) Hash Cond: (field_instances.form_instance_id = form_instances.id) -> Bitmap Heap Scan on field_instances (cost=29232.57..152163.82 rows=531718 width=25) (actual time=64660.000..72800.000 rows=526842 loops=1) Recheck Cond: ((field_id)::text = ANY ('{Book_EstimatedDueDate,H_Mid3}'::text[])) -> Bitmap Index Scan on index_field_instances_on_field_id (cost=0.00..29099.64 rows=531718 width=0) (actual time=64630.000..64630.000 rows=594515 loops=1) Index Cond: ((field_id)::text = ANY ('{Book_EstimatedDueDate,H_Mid3}'::text[])) -> Hash (cost=5025.54..5025.54 rows=3388 width=8) (actual time=980.000..980.000 rows=10457 loops=1) -> Bitmap Heap Scan on form_instances (cost=90.99..5025.54 rows=3388 width=8) (actual time=10.000..950.000 rows=10457 loops=1) Recheck Cond: (((form_id)::text = 'W30L'::text) AND ((workflow_state)::text = 'DRqueued'::text)) -> Bitmap Index Scan on form_id_and_workflow_state (cost=0.00..90.14 rows=3388 width=0) (actual time=0.000..0.000 rows=10457 loops=1) Index Cond: (((form_id)::text = 'W30L'::text) AND ((workflow_state)::text = 'DRqueued'::text)) Total runtime: 75080.000 ms # d form_instances Table "public.form_instances" Column | Type | Modifiers -----------------+-----------------------------+------------------------------------------------------------- id | integer | not null default nextval('form_instances_id_seq'::regclass) form_id | character varying(255) | created_at | timestamp without time zone | updated_at | timestamp without time zone | created_by_id | integer | updated_by_id | integer | workflow | character varying(255) | workflow_state | character varying(255) | validation_data | text | Indexes: "form_instances_pkey" PRIMARY KEY,btree (id) "form_id_and_workflow_state" btree (form_id,workflow_state) "index_form_instances_on_form_id" btree (form_id) "index_form_instances_on_workflow_state" btree (workflow_state) # d field_instances Table "public.field_instances" Column | Type | Modifiers ------------------+-----------------------------+-------------------------------------------------------------- id | integer | not null default nextval('field_instances_id_seq'::regclass) form_instance_id | integer | created_at | timestamp without time zone | updated_at | timestamp without time zone | created_by_id | integer | updated_by_id | integer | field_id | character varying(255) | answer | text | state | character varying(255) | explanation | text | idx | integer | not null default 0 Indexes: "field_instances_pkey" PRIMARY KEY,btree (id) "field_instances__lower_answer" btree (lower(answer)) "index_field_instances_on_answer" btree (answer) "index_field_instances_on_field_id" btree (field_id) "index_field_instances_on_field_id_and_answer" btree (field_id,answer) "index_field_instances_on_form_instance_id" btree (form_instance_id) "index_field_instances_on_idx" btree (idx) 解决方法
以前是评论,但由于它似乎已经解决了问题,我将提出一个实际的答案.
系统估计可能有多少行是关闭的.我们可以看到,在第二个查询中,它估计了位图索引扫描中的3388行,但实际上得到了10457行. 所以你可能想真空全分析; 此外,其他命令可以极大地帮助包括reindex和/或集群. OP表示真空没有帮助,但重新指数确实如此. (编辑:李大同) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |