Scott Marlowe wrote:
It's not really solved, it's just a happy coincidence that the current
plan runs well. In order to keep the query planner making good
choices you need to increase stats target for the field in the index
above. The easiest way to do so is to do this:
alter database mydb set default_statistics_target=100;
and run analyze again:
analyze;
So, i removed the index on field_name, set
default_default_statistics_target to 100, analyzed, and the results are
the same:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1.04..58676.31 rows=218048 width=68) (actual
time=0.067..12268.394 rows=6 loops=1)
Hash Cond: ((u.field_name)::text = (t.key)::text)
-> Seq Scan on photo_info_data u (cost=0.00..47500.30 rows=2398530
width=50) (actual time=0.013..6426.611 rows=2398446 loops=1)
-> Hash (cost=1.02..1.02 rows=2 width=18) (actual
time=0.015..0.015 rows=2 loops=1)
-> Seq Scan on t_query_data t (cost=0.00..1.02 rows=2
width=18) (actual time=0.002..0.006 rows=2 loops=1)
Total runtime: 12268.459 ms
(6 rows)
I even changed default_statistics_target to 1000:
------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1.04..58580.29 rows=208561 width=67) (actual
time=0.054..12434.283 rows=6 loops=1)
Hash Cond: ((u.field_name)::text = (t.key)::text)
-> Seq Scan on photo_info_data u (cost=0.00..47499.46 rows=2398446
width=49) (actual time=0.012..6129.923 rows=2398446 loops=1)
-> Hash (cost=1.02..1.02 rows=2 width=18) (actual
time=0.015..0.015 rows=2 loops=1)
-> Seq Scan on t_query_data t (cost=0.00..1.02 rows=2
width=18) (actual time=0.002..0.004 rows=2 loops=1)
Total runtime: 12434.338 ms
(6 rows)
Even when I run this query, I get sequential scan:
explain analyze select * from photo_info_data where field_name =
'f-spot' or field_name = 'shutter';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Seq Scan on photo_info_data (cost=0.00..59491.69 rows=1705 width=49)
(actual time=0.018..1535.963 rows=6 loops=1)
Filter: (((field_name)::text = 'f-spot'::text) OR
((field_name)::text = 'shutter'::text))
Total runtime: 1536.010 ms
(3 rows)
These are the representations of te values 'f-spot' and 'shutter' for
the field field_name in photo_info_data table:
xmltest=# select field_name, count(*) from user_info_data where
field_name in ('visina', 'spol') group by field_name;
field_name | count
------------+-------
'f-spot' | 3
'shutter' | 3
(2 rows)
Maybe my test-data is poor? As I've mentioned, photo_info_data has
little over 2300000 rows. And this is complete 'distribution' of the data:
xmltest=# select field_name, count(*) from user_info_data group by
field_name order by count(*) desc;
field_name | count
----------------+--------
field_Xx1 | 350000
field_Xx2 | 332447
field_Xx3 | 297414
field_Xx4 | 262394
field_Xx5 | 227396
field_Xx6 | 192547
field_Xx7 | 157612
field_Xx8 | 122543
field_Xx9 | 87442
field_Xx10 | 52296
field_1 | 50000
field_2 | 47389
field_3 | 42412
field_4 | 37390
field_5 | 32366
field_6 | 27238
field_7 | 22360
field_Xx11 | 17589
field_8 | 17412
field_9 | 12383
field_10 | 7386
field_11 | 2410
f-spot | 3
shutter | 3
focal | 3
flash | 3
m_city | 3
person | 3
iso | 2
(29 rows)
No matter what field_name value I enter in WHERE condition, planner
chooses sequential scan. Only when I add seperate index on field_name,
planner chooes index scan or bitmap index scan.
Mike
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance