Re: query plan with index having a btrim is different for strings of different length

Richard Yen <dba@xxxxxxxxxxx> · Wed, 10 Dec 2008 13:54:50 -0800

On Dec 10, 2008, at 11:34 AM, Tom Lane wrote:

Richard Yen <dba@xxxxxxxxxxx> writes:
You guys are right.  I tried "Miller" and gave me the same result.   
Is
there any way to tune this so that for the common last names, the
query run time doesn't jump from <1s to >300s?

If the planner's estimation is that far off then there must be  
something
very weird about the table statistics, but you haven't given us any  
clue
what.

Wow, thanks for helping me out here.  I don't have much experience  
with deconstructing queries and working with stats, so here's what I  
could gather.  If you need more information, please let me know.

tii=# select * from pg_stat_all_tables where relname =  
'm_object_paper' or relname = 'm_assignment';
-[ RECORD 1 ]----+------------------------------
relid            | 17516
schemaname       | public
relname          | m_assignment
seq_scan         | 274
seq_tup_read     | 1039457272
idx_scan         | 372379230
idx_tup_fetch    | 2365235708
n_tup_ins        | 5641638
n_tup_upd        | 520684
n_tup_del        | 30339
n_tup_hot_upd    | 406929
n_live_tup       | 5611665
n_dead_tup       | 11877
last_vacuum      |
last_autovacuum  | 2008-12-04 17:44:57.309717-08
last_analyze     | 2008-10-20 15:09:50.943652-07
last_autoanalyze | 2008-08-15 17:16:14.588153-07
-[ RECORD 2 ]----+------------------------------
relid            | 17792
schemaname       | public
relname          | m_object_paper
seq_scan         | 83613
seq_tup_read     | 184330159906
idx_scan         | 685219945
idx_tup_fetch    | 222892138627
n_tup_ins        | 71564825
n_tup_upd        | 27558792
n_tup_del        | 3058
n_tup_hot_upd    | 22410985
n_live_tup       | 71559627
n_dead_tup       | 585467
last_vacuum      | 2008-10-24 14:36:45.134936-07
last_autovacuum  | 2008-12-05 07:02:40.52712-08
last_analyze     | 2008-11-25 14:42:04.185798-08
last_autoanalyze | 2008-08-15 17:20:28.42811-07

tii=# select * from pg_statio_all_tables where relname =  
'm_object_paper' or relname = 'm_assignment';
-[ RECORD 1 ]---+---------------
relid           | 17516
schemaname      | public
relname         | m_assignment
heap_blks_read  | 22896372
heap_blks_hit   | 1753777105
idx_blks_read   | 7879634
idx_blks_hit    | 1157729592
toast_blks_read | 0
toast_blks_hit  | 0
tidx_blks_read  | 0
tidx_blks_hit   | 0
-[ RECORD 2 ]---+---------------
relid           | 17792
schemaname      | public
relname         | m_object_paper
heap_blks_read  | 2604944369
heap_blks_hit   | 116307527781
idx_blks_read   | 133534908
idx_blks_hit    | 3601637440
toast_blks_read | 0
toast_blks_hit  | 0
tidx_blks_read  | 0
tidx_blks_hit   | 0

Also, yes, we've kicked around the idea of doing an index on the  
concatenation of the first and last names--that would definitely be  
more unique, and I think we're actually going to move to that.  Just  
thought I'd dig deeper here to learn more.

Thanks!
--Richard

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance