Hi,
I am using Postgres version 9.4.4 on a Mac machine. I have 2 queries that differ only in the order by clause. One of it has 'nulls last' and the other one does not have it. The performance difference between the two is considerable.
The slower of the two queries is
The slower of the two queries is
SELECT wos.notificationstatus,
wos.unrepliedcount,
wos.shownotestotech,
wos.ownerid,
wos.isfcr,
aau.user_id,
wo.workorderid AS "WOID",
wo.is_catalog_template AS "TemplateType",
wo.title AS "Title",
wo.is_catalog_template,
aau.first_name AS "Requester",
cd.categoryname AS "Category",
ti.first_name AS "Owner",
wo.duebytime AS "DueBy",
wo.fr_duetime,
wo.completedtime AS "CompletedTime",
wo.respondedtime AS "RespondedTime",
wo.resolvedtime AS "ResolvedTime",
qd.queuename AS "Group",
std.statusname AS "Status",
wo.createdtime AS "CreatedDate",
wos.isread,
wos.hasattachment,
wos.appr_statusid,
wos.priorityid,
wo.templateid AS "TemplateId",
pd.priorityid,
pd.priorityname AS "Priority",
pd.prioritycolor AS "PriorityColor",
wos.isoverdue,
wos.is_fr_overdue,
wos.linkedworkorderid,
wos.editing_status,
wos.editorid,
wos.linkedworkorderid,
wo.isparent,
sduser.isvipuser,
sduser_onbehalfof.isvipuser AS "ONBEHALFOFVIP",
wo.isparent,
wos.statusid,
sdorganization.name AS "Site",
wo.workorderid AS "RequestID"
FROM workorder wo
left join workorder_fields wof
ON wo.workorderid=wof.workorderid
left join servicecatalog_fields scf
ON wo.workorderid=scf.workorderid
left join wotoprojects wtp
ON wo.workorderid=wtp.workorderid
left join sitedefinition
ON wo.siteid=sitedefinition.siteid
left join sdorganization
ON sitedefinition.siteid=sdorganization.org_id
inner join workorderstates wos
ON wo.workorderid=wos.workorderid
left join categorydefinition cd
ON wos.categoryid=cd.categoryid
left join aaauser ti
ON wos.ownerid=ti.user_id
left join aaauser aau
ON wo.requesterid=aau.user_id
left join prioritydefinition pd
ON wos.priorityid=pd.priorityid
left join statusdefinition std
ON wos.statusid=std.statusid
left join workorder_queue wo_queue
ON wo.workorderid=wo_queue.workorderid
left join queuedefinition qd
ON wo_queue.queueid=qd.queueid
left join departmentdefinition dpt
ON wo.deptid=dpt.deptid
left join leveldefinition lvd
ON wos.levelid=lvd.levelid
left join modedefinition mdd
ON wo.modeid=mdd.modeid
left join urgencydefinition urgdef
ON wos.urgencyid=urgdef.urgencyid
left join impactdefinition impdef
ON wos.impactid=impdef.impactid
left join requesttypedefinition rtdef
ON wos.requesttypeid=rtdef.requesttypeid
left join subcategorydefinition scd
ON wos.subcategoryid=scd.subcategoryid
left join itemdefinition icd
ON wos.itemid=icd.itemid
left join servicedefinition serdef
ON wo.serviceid=serdef.serviceid
left join aaauser cbau
ON wo.createdbyid=cbau.user_id
left join aaauser oboaau
ON wo.oboid=oboaau.user_id
left join sduser
ON wo.requesterid=sduser.userid
left join sduser sduser_onbehalfof
ON wo.oboid=sduser_onbehalfof.userid
left join workorder_fields
ON wo.workorderid=workorder_fields.workorderid
WHERE ((
wos.statusid = 1)
AND (
wo.isparent = TRUE))
ORDER BY 7 DESC nulls last limit 25
On removing 'nulls last' from the order by clause the query becomes very fast. I have attached the query plan for both the queries.
From the plan it looks like the second query is able to efficiently use the workorder_pk index ( The node 'Index Scan Backward using workorder_pk on workorder' returns 25 rows) whereas the first query is not able to use the index efficiently (more than 300k rows are returned from the same node).
The column workorderid is a PK column. The query optimizer should ideally know that there is no nulls in this column and in effect there is no difference between the two queries.
I tried the same in Postgres 10 and the slower query performs much better due to parallel sequential scans but still it is less efficient than the query without 'nulls last'.
I thought it would be best to raise this with the Postgres team.
Regards,
From the plan it looks like the second query is able to efficiently use the workorder_pk index ( The node 'Index Scan Backward using workorder_pk on workorder' returns 25 rows) whereas the first query is not able to use the index efficiently (more than 300k rows are returned from the same node).
The column workorderid is a PK column. The query optimizer should ideally know that there is no nulls in this column and in effect there is no difference between the two queries.
I tried the same in Postgres 10 and the slower query performs much better due to parallel sequential scans but still it is less efficient than the query without 'nulls last'.
I thought it would be best to raise this with the Postgres team.
Regards,
Nanda
pg_9_4_Fast QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=7.00..79.41 rows=25 width=268) (actual time=0.568..1.348 rows=25 loops=1) -> Nested Loop Left Join (cost=7.00..250829.09 rows=86592 width=268) (actual time=0.567..1.343 rows=25 loops=1) -> Nested Loop Left Join (cost=6.71..223240.48 rows=86592 width=275) (actual time=0.565..1.320 rows=25 loops=1) -> Nested Loop Left Join (cost=6.42..195651.87 rows=86592 width=282) (actual time=0.556..1.198 rows=25 loops=1) Join Filter: (wos.statusid = std.statusid) -> Nested Loop Left Join (cost=6.42..194351.80 rows=86592 width=273) (actual time=0.411..1.022 rows=25 loops=1) -> Nested Loop Left Join (cost=6.29..180268.60 rows=86592 width=253) (actual time=0.409..0.999 rows=25 loops=1) -> Nested Loop Left Join (cost=6.00..152702.85 rows=86592 width=234) (actual time=0.402..0.891 rows=25 loops=1) -> Nested Loop Left Join (cost=5.71..124414.37 rows=86592 width=223) (actual time=0.390..0.601 rows=25 loops=1) Join Filter: (wo.workorderid = wo_queue.workorderid) Rows Removed by Join Filter: 25 -> Nested Loop Left Join (cost=4.69..123112.82 rows=86592 width=208) (actual time=0.042..0.230 rows=25 loops=1) -> Nested Loop Left Join (cost=4.56..109029.62 rows=86592 width=205) (actual time=0.039..0.208 rows=25 loops=1) -> Nested Loop Left Join (cost=4.41..83014.38 rows=86592 width=205) (actual time=0.037..0.187 rows=25 loops=1) -> Nested Loop Left Join (cost=4.27..68290.31 rows=86592 width=205) (actual time=0.035..0.163 rows=25 loops=1) -> Merge Join (cost=4.12..36022.21 rows=86592 width=205) (actual time=0.030..0.123 rows=25 loops=1) Merge Cond: (wo.workorderid = wos.workorderid) -> Index Scan Backward using workorder_pk on workorder wo (cost=0.42..18645.63 rows=302945 width=157) (actual time=0.013..0.048 rows=25 loops=1) Filter: isparent -> Index Scan Backward using workorderstates_fk1_idx on workorderstates wos (cost=0.42..15538.74 rows=86592 width=136) (actual time=0.012..0.034 rows=25 loops=1) Filter: (statusid = 1) -> Index Only Scan using wotoprojects_fk1_idx on wotoprojects wtp (cost=0.15..0.29 rows=8 width=8) (actual time=0.001..0.001 rows=0 loops=25) Index Cond: (workorderid = wo.workorderid) Heap Fetches: 0 -> Index Only Scan using sitedefinition_fk2_idx on sitedefinition (cost=0.14..0.16 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=25) Index Cond: (siteid = wo.siteid) Heap Fetches: 0 -> Index Scan using sdorganization_pk on sdorganization (cost=0.14..0.29 rows=1 width=16) (actual time=0.000..0.000 rows=0 loops=25) Index Cond: (sitedefinition.siteid = org_id) -> Index Scan using categorydefinition_pk on categorydefinition cd (cost=0.14..0.15 rows=1 width=19) (actual time=0.000..0.000 rows=0 loops=25) Index Cond: (wos.categoryid = categoryid) -> Materialize (cost=1.02..2.67 rows=1 width=23) (actual time=0.013..0.014 rows=1 loops=25) -> Hash Right Join (cost=1.02..2.67 rows=1 width=23) (actual time=0.324..0.341 rows=1 loops=1) Hash Cond: (qd.queueid = wo_queue.queueid) -> Seq Scan on queuedefinition qd (cost=0.00..1.46 rows=46 width=23) (actual time=0.154..0.161 rows=46 loops=1) -> Hash (cost=1.01..1.01 rows=1 width=16) (actual time=0.158..0.158 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Seq Scan on workorder_queue wo_queue (cost=0.00..1.01 rows=1 width=16) (actual time=0.154..0.155 rows=1 loops=1) -> Index Scan using aaauser_pk on aaauser ti (cost=0.29..0.32 rows=1 width=19) (actual time=0.010..0.011 rows=1 loops=25) Index Cond: (wos.ownerid = user_id) -> Index Scan using aaauser_pk on aaauser aau (cost=0.29..0.31 rows=1 width=19) (actual time=0.003..0.003 rows=1 loops=25) Index Cond: (wo.requesterid = user_id) -> Index Scan using prioritydefinition_pk on prioritydefinition pd (cost=0.14..0.15 rows=1 width=20) (actual time=0.000..0.000 rows=0 loops=25) Index Cond: (wos.priorityid = priorityid) -> Materialize (cost=0.00..1.19 rows=1 width=17) (actual time=0.006..0.006 rows=1 loops=25) -> Seq Scan on statusdefinition std (cost=0.00..1.19 rows=1 width=17) (actual time=0.142..0.145 rows=1 loops=1) Filter: (statusid = 1) Rows Removed by Filter: 14 -> Index Scan using sduser_fk1_idx on sduser (cost=0.29..0.31 rows=1 width=9) (actual time=0.003..0.004 rows=1 loops=25) Index Cond: (wo.requesterid = userid) -> Index Scan using sduser_fk1_idx on sduser sduser_onbehalfof (cost=0.29..0.31 rows=1 width=9) (actual time=0.000..0.000 rows=0 loops=25) Index Cond: (wo.oboid = userid) Planning time: 36.746 ms Execution time: 1.710 ms (54 rows)
pg_9_4_Slow QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=51055.44..51055.50 rows=25 width=268) (actual time=2746.814..2746.819 rows=25 loops=1) -> Sort (cost=51055.44..51271.92 rows=86592 width=268) (actual time=2746.812..2746.813 rows=25 loops=1) Sort Key: wo.workorderid Sort Method: top-N heapsort Memory: 31kB -> Hash Left Join (cost=2800.05..48611.88 rows=86592 width=268) (actual time=49.285..2634.867 rows=86130 loops=1) Hash Cond: (wo.oboid = sduser_onbehalfof.userid) -> Hash Left Join (cost=2120.45..47174.59 rows=86592 width=275) (actual time=36.652..2574.275 rows=86130 loops=1) Hash Cond: (wo.requesterid = sduser.userid) -> Nested Loop Left Join (cost=1440.86..44871.40 rows=86592 width=282) (actual time=23.899..2449.283 rows=86130 loops=1) Join Filter: (wos.statusid = std.statusid) -> Hash Left Join (cost=1440.86..43571.33 rows=86592 width=273) (actual time=23.889..2372.958 rows=86130 loops=1) Hash Cond: (wos.priorityid = pd.priorityid) -> Hash Left Join (cost=1439.52..42528.00 rows=86592 width=253) (actual time=23.848..2323.138 rows=86130 loops=1) Hash Cond: (wo.requesterid = aau.user_id) -> Hash Left Join (cost=779.93..40244.81 rows=86592 width=234) (actual time=11.128..2198.954 rows=86130 loops=1) Hash Cond: (wos.ownerid = ti.user_id) -> Hash Left Join (cost=120.34..38058.72 rows=86592 width=223) (actual time=0.280..2106.865 rows=86130 loops=1) Hash Cond: (wo.workorderid = wo_queue.workorderid) -> Hash Left Join (cost=117.66..37731.31 rows=86592 width=208) (actual time=0.239..2060.880 rows=86130 loops=1) Hash Cond: (wos.categoryid = cd.categoryid) -> Hash Left Join (cost=116.37..36687.54 rows=86592 width=205) (actual time=0.218..2013.580 rows=86130 loops=1) Hash Cond: (wo.siteid = sitedefinition.siteid) -> Merge Left Join (cost=108.95..36355.39 rows=86592 width=205) (actual time=0.041..1974.608 rows=86130 loops=1) Merge Cond: (wo.workorderid = wtp.workorderid) -> Merge Join (cost=4.12..36022.21 rows=86592 width=205) (actual time=0.032..1940.839 rows=86130 loops=1) Merge Cond: (wo.workorderid = wos.workorderid) -> Index Scan Backward using workorder_pk on workorder wo (cost=0.42..18645.63 rows=302945 width=157) (actual time=0.015..1020.405 rows=302945 loops=1) Filter: isparent -> Index Scan Backward using workorderstates_fk1_idx on workorderstates wos (cost=0.42..15538.74 rows=86592 width=136) (actual time=0.012..765.145 rows=86130 loops=1) Filter: (statusid = 1) Rows Removed by Filter: 216815 -> Sort (cost=104.83..108.61 rows=1510 width=8) (actual time=0.008..0.008 rows=0 loops=1) Sort Key: wtp.workorderid Sort Method: quicksort Memory: 25kB -> Seq Scan on wotoprojects wtp (cost=0.00..25.10 rows=1510 width=8) (actual time=0.000..0.000 rows=0 loops=1) -> Hash (cost=6.27..6.27 rows=92 width=16) (actual time=0.165..0.165 rows=92 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 5kB -> Hash Right Join (cost=3.07..6.27 rows=92 width=16) (actual time=0.072..0.128 rows=92 loops=1) Hash Cond: (sdorganization.org_id = sitedefinition.siteid) -> Seq Scan on sdorganization (cost=0.00..1.93 rows=93 width=16) (actual time=0.005..0.012 rows=93 loops=1) -> Hash (cost=1.92..1.92 rows=92 width=8) (actual time=0.055..0.055 rows=92 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 4kB -> Seq Scan on sitedefinition (cost=0.00..1.92 rows=92 width=8) (actual time=0.004..0.019 rows=92 loops=1) -> Hash (cost=1.13..1.13 rows=13 width=19) (actual time=0.013..0.013 rows=13 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Seq Scan on categorydefinition cd (cost=0.00..1.13 rows=13 width=19) (actual time=0.003..0.006 rows=13 loops=1) -> Hash (cost=2.67..2.67 rows=1 width=23) (actual time=0.033..0.033 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Hash Right Join (cost=1.02..2.67 rows=1 width=23) (actual time=0.016..0.032 rows=1 loops=1) Hash Cond: (qd.queueid = wo_queue.queueid) -> Seq Scan on queuedefinition qd (cost=0.00..1.46 rows=46 width=23) (actual time=0.004..0.007 rows=46 loops=1) -> Hash (cost=1.01..1.01 rows=1 width=16) (actual time=0.003..0.003 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Seq Scan on workorder_queue wo_queue (cost=0.00..1.01 rows=1 width=16) (actual time=0.002..0.002 rows=1 loops=1) -> Hash (cost=400.93..400.93 rows=20693 width=19) (actual time=10.824..10.824 rows=20693 loops=1) Buckets: 4096 Batches: 1 Memory Usage: 1132kB -> Seq Scan on aaauser ti (cost=0.00..400.93 rows=20693 width=19) (actual time=0.006..4.313 rows=20693 loops=1) -> Hash (cost=400.93..400.93 rows=20693 width=19) (actual time=12.689..12.689 rows=20693 loops=1) Buckets: 4096 Batches: 1 Memory Usage: 1040kB -> Seq Scan on aaauser aau (cost=0.00..400.93 rows=20693 width=19) (actual time=0.005..5.184 rows=20693 loops=1) -> Hash (cost=1.15..1.15 rows=15 width=20) (actual time=0.027..0.027 rows=15 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Seq Scan on prioritydefinition pd (cost=0.00..1.15 rows=15 width=20) (actual time=0.017..0.021 rows=15 loops=1) -> Materialize (cost=0.00..1.19 rows=1 width=17) (actual time=0.000..0.000 rows=1 loops=86130) -> Seq Scan on statusdefinition std (cost=0.00..1.19 rows=1 width=17) (actual time=0.004..0.008 rows=1 loops=1) Filter: (statusid = 1) Rows Removed by Filter: 14 -> Hash (cost=420.93..420.93 rows=20693 width=9) (actual time=12.726..12.726 rows=20693 loops=1) Buckets: 4096 Batches: 1 Memory Usage: 970kB -> Seq Scan on sduser (cost=0.00..420.93 rows=20693 width=9) (actual time=0.006..5.942 rows=20693 loops=1) -> Hash (cost=420.93..420.93 rows=20693 width=9) (actual time=12.584..12.584 rows=20693 loops=1) Buckets: 4096 Batches: 1 Memory Usage: 970kB -> Seq Scan on sduser sduser_onbehalfof (cost=0.00..420.93 rows=20693 width=9) (actual time=0.005..5.884 rows=20693 loops=1) Planning time: 69.901 ms Execution time: 2749.108 ms (75 rows)