strange issue for certain queries

Mike Gargano <mrg@xxxxxxxxxxxxxxxxxx> · Thu, 15 Feb 2007 12:00:02 -0500

Hello,

	I'm experiencing some very unusual speed problems while executing a  
particular query.  Appending an empty string to one of the fields in  
the query speeds up the execution by a 1000 fold.  According to the  
planner this alternate query is much more costly too.  Here are the  
EXPLAIN ANALYZE results of the two queries (the preferred one first,  
the hack one second):

            QUERY PLAN
------------------------------------------------------------------------ 
------------------------------------------------------------------------ 
--------------------------------
Sort  (cost=1721.66..1721.66 rows=1 width=818) (actual  
time=3163.430..3163.433 rows=4 loops=1)
   Sort Key: COALESCE(keys.word2, keys.word1)
   ->  Nested Loop Left Join  (cost=1685.65..1721.65 rows=1  
width=818) (actual time=3161.971..3163.375 rows=4 loops=1)
         ->  Hash Join  (cost=1685.65..1716.71 rows=1 width=813)  
(actual time=3160.479..3160.784 rows=4 loops=1)
               Hash Cond: ("outer".pbx_id = "inner".pbx_id)
               ->  Nested Loop  (cost=1681.61..1712.64 rows=3  
width=813) (actual time=3093.394..3160.398 rows=40 loops=1)
                     ->  Nested Loop  (cost=1681.61..1700.79 rows=1  
width=817) (actual time=3093.185..3158.959 rows=40 loops=1)
                           ->  Bitmap Heap Scan on keys   
(cost=1681.61..1697.63 rows=1 width=21) (actual  
time=3092.935..3156.989 rows=40 loops=1)
                                 Recheck Cond: (((word0)::text =  
'2727'::text) AND ((feature)::text = 'ACD'::text))
                                 Filter: ((number = 0) AND (word1 IS  
NOT NULL))
                                 ->  BitmapAnd   
(cost=1681.61..1681.61 rows=4 width=0) (actual  
time=3090.806..3090.806 rows=0 loops=1)
                                       ->  Bitmap Index Scan on  
keys_word0_idx  (cost=0.00..5.00 rows=573 width=0) (actual  
time=2.295..2.295 rows=2616 loops=1)
                                             Index Cond:  
((word0)::text = '2727'::text)
                                       ->  Bitmap Index Scan on  
keys_feature_idx  (cost=0.00..1676.36 rows=277530 width=0) (actual  
time=3086.875..3086.875 rows=276590 loops=1)
                                             Index Cond:  
((feature)::text = 'ACD'::text)
                           ->  Index Scan using tnb_pkey on tnb   
(cost=0.00..3.15 rows=1 width=796) (actual time=0.043..0.045 rows=1  
loops=40)
                                 Index Cond: ("outer".tnb_id =  
tnb.tnb_id)
                     ->  Index Scan using has_tn_tnb_id_idx on  
has_tn  (cost=0.00..11.81 rows=3 width=12) (actual time=0.028..0.032  
rows=1 loops=40)
                           Index Cond: ("outer".tnb_id = has_tn.tnb_id)
               ->  Hash  (cost=4.03..4.03 rows=1 width=4) (actual  
time=0.257..0.257 rows=1 loops=1)
                     ->  Index Scan using request_to_pbx_ids_pkey on  
request_to_pbx_ids  (cost=0.00..4.03 rows=1 width=4) (actual  
time=0.217..0.221 rows=1 loops=1)
                           Index Cond: (request_id = 206335)
         ->  Index Scan using names_pkey on  
"names"  (cost=0.00..4.91 rows=1 width=29) (actual time=0.059..0.059  
rows=0 loops=4)
               Index Cond: (("outer".pbx_id = "names".pbx_id) AND  
(("outer".primary_dn)::text = ("names".primary_dn)::text))
Total runtime: 3164.147 ms
(25 rows)

         QUERY PLAN
------------------------------------------------------------------------ 
------------------------------------------------------------------------ 
--------------------------
Sort  (cost=12391.96..12391.96 rows=1 width=818) (actual  
time=4.020..4.023 rows=4 loops=1)
   Sort Key: COALESCE(keys.word2, keys.word1)
   ->  Nested Loop Left Join  (cost=0.00..12391.95 rows=1 width=818)  
(actual time=2.796..3.969 rows=4 loops=1)
         ->  Nested Loop  (cost=0.00..12387.01 rows=1 width=813)  
(actual time=2.055..2.234 rows=4 loops=1)
               ->  Nested Loop  (cost=0.00..12383.85 rows=1  
width=33) (actual time=2.026..2.138 rows=4 loops=1)
                     ->  Nested Loop  (cost=0.00..7368.29 rows=591  
width=12) (actual time=0.469..0.698 rows=73 loops=1)
                           ->  Index Scan using  
request_to_pbx_ids_pkey on request_to_pbx_ids  (cost=0.00..4.03  
rows=1 width=4) (actual time=0.215..0.217 rows=1 loops=1)
                                 Index Cond: (request_id = 206335)
                           ->  Index Scan using has_tn_pkey on  
has_tn  (cost=0.00..7319.43 rows=3586 width=12) (actual  
time=0.241..0.401 rows=73 loops=1)
                                 Index Cond: (has_tn.pbx_id =  
"outer".pbx_id)
                     ->  Index Scan using keys_pkey on keys   
(cost=0.00..8.47 rows=1 width=21) (actual time=0.018..0.018 rows=0  
loops=73)
                           Index Cond: ((keys.tnb_id =  
"outer".tnb_id) AND (keys.number = 0))
                           Filter: (((feature)::text = 'ACD'::text)  
AND (((word0)::text || ''::text) = '2727'::text) AND (word1 IS NOT  
NULL))
               ->  Index Scan using tnb_pkey on tnb   
(cost=0.00..3.15 rows=1 width=796) (actual time=0.018..0.020 rows=1  
loops=4)
                     Index Cond: ("outer".tnb_id = tnb.tnb_id)
         ->  Index Scan using names_pkey on  
"names"  (cost=0.00..4.91 rows=1 width=29) (actual time=0.017..0.017  
rows=0 loops=4)
               Index Cond: (("outer".pbx_id = "names".pbx_id) AND  
(("outer".primary_dn)::text = ("names".primary_dn)::text))
Total runtime: 4.624 ms
(18 rows)

As you can see the second query executes in about 4 ms as opposed to  
3000 ms.
The actual query is...
SELECT name, CASE WHEN raw_tn ~ '\n +SPV *\n' THEN '&check;' ELSE ''  
END AS supervisor, COALESCE(word2, word1) AS agent_id, des, tn_string 
(loop,shelf,card,unit) AS tn, display_type(tnb_id) FROM keys JOIN  
has_tn USING (tnb_id) JOIN tnb USING (tnb_id) JOIN request_to_pbx_ids  
USING (pbx_id) LEFT JOIN names USING (pbx_id, primary_dn) WHERE  
request_id = 206335 AND number = 0 AND feature = 'ACD' AND word0 =  
'2727' AND word1 IS NOT NULL ORDER BY agent_id USING <;

all that is done to obtain the second query is appending '' to word0  
in the WHERE clause, so it becomes word0 || '' = '2727'

This word0 field is index and the planner returns what I consider to  
be the correct plan.  The plan for the second query has a much higher  
cost, but it actually runs much faster.  Autovacuum is turned on and  
the database is vacuumed every 4 hrs.  and I just tried reindexing  
the whole table since there is very sparse data in the 40 million row  
table and the results remained the same.

Here are some of the settings for postgres...
autovacuum                      |  
on                                          | Starts the autovacuum  
subprocess.
autovacuum_analyze_scale_factor |  
0.2                                         | Number of tuple  
inserts, updates or deletes prior to analyze as a fract
ion of reltuples.
autovacuum_analyze_threshold    |  
500                                         | Minimum number of tuple  
inserts, updates or deletes prior to analyze.
autovacuum_naptime              |  
60                                          | Time to sleep between  
autovacuum runs, in seconds.
autovacuum_vacuum_cost_delay    |  
-1                                          | Vacuum cost delay in  
milliseconds, for autovacuum.
autovacuum_vacuum_cost_limit    |  
-1                                          | Vacuum cost amount  
available before napping, for autovacuum.
autovacuum_vacuum_scale_factor  |  
0.4                                         | Number of tuple updates  
or deletes prior to vacuum as a fraction of rel
tuples.
autovacuum_vacuum_threshold     |  
1000                                        | Minimum number of tuple  
updates or deletes prior to vacuum.
backslash_quote                 |  
safe_encoding                               | Sets whether "\'" is  
allowed in string literals.
bgwriter_all_maxpages           |  
5                                           | Background writer  
maximum number of all pages to flush per round
bgwriter_all_percent            |  
0.333                                       | Background writer  
percentage of all buffers to flush per round
bgwriter_delay                  |  
200                                         | Background writer sleep  
time between rounds in milliseconds
bgwriter_lru_maxpages           |  
5                                           | Background writer  
maximum number of LRU pages to flush per round
bgwriter_lru_percent            |  
1                                           | Background writer  
percentage of LRU buffers to flush per round
block_size                      | 8192
constraint_exclusion            |  
off                                         | Enables the planner to  
use constraints to optimize queries.
cpu_index_tuple_cost            |  
0.001                                       | Sets the planner's  
estimate of processing cost for each index tuple (ro
w) during index scan.
cpu_operator_cost               |  
0.0025                                      | Sets the planner's  
estimate of processing cost of each operator in WHER
E.
cpu_tuple_cost                  | 0.01
full_page_writes                |  
on                                          | Writes full pages to  
WAL when first modified after a checkpoint.
geqo                            |  
on                                          | Enables genetic query  
optimization.
geqo_effort                     |  
5                                           | GEQO: effort is used to  
set the default for other GEQO parameters.
geqo_generations                |  
0                                           | GEQO: number of  
iterations of the algorithm.
geqo_pool_size                  |  
0                                           | GEQO: number of  
individuals in the population.
geqo_selection_bias             |  
2                                           | GEQO: selective  
pressure within the population.
geqo_threshold                  | 12
join_collapse_limit             | 8
server_encoding                 |  
SQL_ASCII                                   | Sets the server  
(database) character set encoding.
server_version                  |  
8.1.4                                       | Shows the server version.
shared_buffers                  | 400
vacuum_cost_delay               |  
0                                           | Vacuum cost delay in  
milliseconds.
vacuum_cost_limit               |  
200                                         | Vacuum cost amount  
available before napping.
vacuum_cost_page_dirty          |  
20                                          | Vacuum cost for a page  
dirtied by vacuum.
vacuum_cost_page_hit            |  
1                                           | Vacuum cost for a page  
found in the buffer cache.
vacuum_cost_page_miss           |  
10                                          | Vacuum cost for a page  
not found in the buffer cache.
wal_buffers                     |  
8                                           | Sets the number of disk- 
page buffers in shared memory for WAL.
wal_sync_method                 |  
fsync                                       | Selects the method used  
for forcing WAL updates out to disk.
work_mem                        |  
1024                                        | Sets the maximum memory  
to be used for query workspaces.
zero_damaged_pages              | off

Thanks in advance for any help you can offer on this problem.

-Mike