Search Postgresql Archives

enable_sort optimization problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



(8.0.1 on debian/linux 2.6.11 kernel)

I have noticed that if I set enable_sort=false in the .conf file, my queries are running faster. I had a query which if I did a limit 20, ran in 6 milliseconds, but if I changed it to limit 21, it took around 19 seconds (or 19000 milliseconds). It also took longer if I did limit 19 offset 2. (I don't know what it is about the 21st record). In any case, I noticed that in the analysis, the long version was doing a sort and the quick version was not, so I tried the enable_sort=false setting, and now things are generally running faster.

I HAVE done analyzes, and vacuums, and vacuum analyzes. In prior experimenting with this, there were even some seq_scans, which turned into index_scans when I set enable_seqscan=false, and became moderately faster.

I am using 8.0.1, and below are the two query plans, first the enable_sort=true version, then the enable_sort=false version, note the VAST difference in speed. What is the problem, and how can I convince the query optimizer to do the right thing (short of enable_sort=false)?

from the config file:
# - Planner Cost Constants -

#effective_cache_size = 1000    # typically 8KB each
#random_page_cost = 4           # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01          # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025     # (same)


With enable_sort=true: (the default)
opennms=# explain analyze SELECT OUTAGES.*, NODE.NODELABEL, IPINTERFACE.IPHOSTNAME, SERVICE.SERVICENAME, NOTIFICATIONS.NOTIFYID, NOTIFICATIONS.ANSWEREDBY FROM OUTAGES JOIN NODE USING(NODEID) JOIN IPINTERFACE ON OUTAGES.NODEID=IPINTERFACE.NODEID AND OUTAGES.IPADDR=IPINTERFACE.IPADDR LEFT OUTER JOIN SERVICE ON OUTAGES.SERVICEID=SERVICE.SERVICEID LEFT OUTER JOIN NOTIFICATIONS ON SVCLOSTEVENTID=NOTIFICATIONS.EVENTID ORDER BY OUTAGEID DESC LIMIT 29 OFFSET 2;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=127422.75..127422.82 rows=29 width=126) (actual time=40858.260..40858.608 rows=29 loops=1)
-> Sort (cost=127422.74..127423.14 rows=161 width=126) (actual time=40858.230..40858.367 rows=31 loops=1)
Sort Key: outages.outageid
-> Nested Loop Left Join (cost=1047.62..127416.84 rows=161 width=126) (actual time=2448.671..38122.920 rows=134579 loops=1)
-> Hash Left Join (cost=1047.62..126533.18 rows=161 width=102) (actual time=2448.616..23963.962 rows=132250 loops=1)
Hash Cond: ("outer".serviceid = "inner".serviceid)
-> Hash Join (cost=1044.17..126527.32 rows=161 width=92) (actual time=1996.055..21178.268 rows=132250 loops=1)
Hash Cond: (("outer".nodeid = "inner".nodeid) AND (("outer".ipaddr)::text = ("inner".ipaddr)::text))
-> Index Scan using outages_serviceid_idx on outages (cost=0.00..123497.86 rows=132245 width=52) (actual time=81.413..15330.873 rows=132250 loops=1)
-> Hash (cost=1037.31..1037.31 rows=1372 width=64) (actual time=1914.558..1914.558 rows=0 loops=1)
-> Merge Join (cost=0.00..1037.31 rows=1372 width=64) (actual time=150.436..1906.441 rows=1372 loops=1)
Merge Cond: ("outer".nodeid = "inner".nodeid)
-> Index Scan using pk_nodeid on node (cost=0.00..229.55 rows=1043 width=24) (actual time=0.022..623.197 rows=1043 loops=1)
-> Index Scan using ipinterface_nodeid_idx on ipinterface (cost=0.00..788.01 rows=1372 width=40) (actual time=150.369..1259.366 rows=1372 loops=1)
-> Hash (cost=3.36..3.36 rows=33 width=14) (actual time=452.519..452.519 rows=0 loops=1)
-> Index Scan using pk_serviceid on service (cost=0.00..3.36 rows=33 width=14) (actual time=452.193..452.369 rows=33 loops=1)
-> Index Scan using notifications_eventid_idx on notifications (cost=0.00..5.46 rows=2 width=28) (actual time=0.078..0.079 rows=0 loops=132250)
Index Cond: ("outer".svclosteventid = notifications.eventid)
Total runtime: 40868.130 ms
(19 rows)



And the enable_sort=false version:
opennms=# explain analyze SELECT OUTAGES.*, NODE.NODELABEL, IPINTERFACE.IPHOSTNAME, SERVICE.SERVICENAME, NOTIFICATIONS.NOTIFYID, NOTIFICATIONS.ANSWEREDBY FROM OUTAGES JOIN NODE USING(NODEID) JOIN IPINTERFACE ON OUTAGES.NODEID=IPINTERFACE.NODEID AND OUTAGES.IPADDR=IPINTERFACE.IPADDR LEFT OUTER JOIN SERVICE ON OUTAGES.SERVICEID=SERVICE.SERVICEID LEFT OUTER JOIN NOTIFICATIONS ON SVCLOSTEVENTID=NOTIFICATIONS.EVENTID ORDER BY OUTAGEID DESC LIMIT 29 OFFSET 2;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=12600.25..195303.93 rows=29 width=126) (actual time=211.185..300.336 rows=29 loops=1)
-> Nested Loop Left Join (cost=0.00..1014320.39 rows=161 width=126) (actual time=147.328..300.072 rows=31 loops=1)
-> Nested Loop Left Join (cost=0.00..1013436.73 rows=161 width=102) (actual time=147.264..298.671 rows=31 loops=1)
-> Nested Loop (cost=0.00..1012949.54 rows=161 width=92) (actual time=137.286..287.138 rows=31 loops=1)
-> Nested Loop (cost=0.00..1012208.81 rows=159 width=76) (actual time=126.277..254.709 rows=31 loops=1)
-> Index Scan Backward using pk_outageid on outages (cost=0.00..252480.62 rows=132245 width=52) (actual time=77.021..77.484 rows=31 loops=1)
-> Index Scan using ipinterface_nodeid_ipaddr_ismanaged_idx on ipinterface (cost=0.00..5.73 rows=1 width=40) (actual time=5.304..5.686 rows=1 loops=31)
Index Cond: (("outer".nodeid = ipinterface.nodeid) AND (("outer".ipaddr)::text = (ipinterface.ipaddr)::text))
-> Index Scan using pk_nodeid on node (cost=0.00..4.65 rows=1 width=24) (actual time=1.010..1.015 rows=1 loops=31)
Index Cond: ("outer".nodeid = node.nodeid)
-> Index Scan using pk_serviceid on service (cost=0.00..3.01 rows=1 width=14) (actual time=0.337..0.344 rows=1 loops=31)
Index Cond: ("outer".serviceid = service.serviceid)
-> Index Scan using notifications_eventid_idx on notifications (cost=0.00..5.46 rows=2 width=28) (actual time=0.019..0.019 rows=0 loops=31)
Index Cond: ("outer".svclosteventid = notifications.eventid)
Total runtime: 300.856 ms
(15 rows)


I'm much happier with 300ms vs 40868ms. Am I doing any harm by leaving enable_sort=false? (I assume it will still use sort if it has no choice?) More importantly for the long term, why is the optimizer making such a blunder? (I notice just now that if I remove the limit clause completely, even the enable_sort=false version takes a very long time, so it appears the optimizer isn't handling limit well).

p.s. My only relationship to opennms is as a user (one who is trying to make some reports go faster).

With limit 50 offset 200 variation (with enable_sort=false): Total runtime: 59.288 ms
With limit 50 offset 200 variation (with enable_sort=true): Total runtime: 18764.598 ms
No limit or offset and enable_sort=true: Total runtime: 19510.191 ms
No limit or offset and enable_sort=false: Total runtime: 53530.251 ms


Hm, disabling sort seems to hinder non limited queries, so this seems to be a catch-22.


---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@xxxxxxxxxxxxxx

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux