Search Postgresql Archives

Re: Postgresql selecting strange index for simple query

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Maxim Boguk <mboguk@xxxxxxxxxxxxx> writes:
Tom Lane wrote:
It does know better than that.  I'm wondering if the single-column index
has become very bloated or something.  Have you compared the physical
index sizes?

Table fresh loaded from dump on test server... So no index bloat for sure...
As for comparing physical sizes, right single column index indeed smaller then wrong one:

Huh.  I get sane-looking choices when I try a similar case here.  Can
you put together a self-contained test case?

Not full self-contained test case but some minimal setup data which can give you ideas whats going wrong:
Test confirm my theory wrong index selection linked with long rows in table.

My tests contain such queries:

Ok TEST1:
set random_page_cost=1;
drop TABLE if exists test_table ;
SELECT user_id,last_change_time,rpad('a',1,'b') as f1 into test_table from resume;
ANALYZE test_table;
SELECT count(*) from test_table;
CREATE INDEX right_idx on test_table(last_change_time);
CREATE INDEX wrong_idx on test_table(user_id, last_change_time);
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';
result:
 Index Scan using right_idx on test_table  (cost=0.00..42763.35 rows=388718 width=0) (actual time=0.020..342.653 rows=390370 loops=1)
   Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
 Total runtime: 368.699 ms



Ok TEST2 (but see: cost increased 4x times when real work time increased only by 30%):
same but rpad('a',200,'b') instead of rpad('a',1,'b') when populating test_table:
result:
 Index Scan using right_idx on test_table  (cost=0.00..179346.09 rows=392268 width=0) (actual time=0.089..422.439 rows=390370 loops=1)
   Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
 Total runtime: 448.717 ms


!!Not ok TEST3:!!
same but rpad('a',500,'b') instead of rpad('a',1,'b') when populating test_table:
Oops wrong index used:
 Index Scan using wrong_idx on test_table  (cost=0.00..254918.19 rows=392231 width=0) (actual time=0.067..730.097 rows=390370 loops=1)
   Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
 Total runtime: 757.930 ms
(3 rows)
(btw if drop wrong_idx query become works almost 2х faster:
drop INDEX wrong_idx;
 Index Scan using right_idx on test_table  (cost=0.00..259709.09 rows=392231 width=0) (actual time=0.019..416.108 rows=390370 loops=1)
   Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
 Total runtime: 442.790 ms
)


Again Ok TEST4 (here toast engine removed all long values from test_table):
same but rpad('a',2000,'b') instead of rpad('a',1,'b') when populating test_table:
 Index Scan using right_idx on test_table  (cost=0.00..64606.50 rows=393002 width=0) (actual time=0.058..371.723 rows=390370 loops=1)
   Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
 Total runtime: 397.929 ms


So i have two theory (just waving hands ofcourse):
1)integer owerflow somewhere in cost calculation
2)floating rounding errors (because cost very close in wrong situations: 254918.19 vs 259709.09)

PS: second issue is cost increasing with increasing lenght of rows in table complete indepenent with increase real work time.
(probably i need tune other _cost parameters for get more exact cost values)

PPS: sorry for my poor english

Regards, Maxim Boguk

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux