Re: Forcing seq_scan off for large table joined with tiny table yeilds improved performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Scott Marlowe wrote:

It's not really solved, it's just a happy coincidence that the current
plan runs well.  In order to keep the query planner making good
choices you need to increase stats target for the field in the index
above.  The easiest way to do so is to do this:

alter database mydb set default_statistics_target=100;

and run analyze again:

analyze;

So, i removed the index on field_name, set default_default_statistics_target to 100, analyzed, and the results are the same:

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1.04..58676.31 rows=218048 width=68) (actual time=0.067..12268.394 rows=6 loops=1)
   Hash Cond: ((u.field_name)::text = (t.key)::text)
-> Seq Scan on photo_info_data u (cost=0.00..47500.30 rows=2398530 width=50) (actual time=0.013..6426.611 rows=2398446 loops=1) -> Hash (cost=1.02..1.02 rows=2 width=18) (actual time=0.015..0.015 rows=2 loops=1) -> Seq Scan on t_query_data t (cost=0.00..1.02 rows=2 width=18) (actual time=0.002..0.006 rows=2 loops=1)
 Total runtime: 12268.459 ms
(6 rows)

I even changed default_statistics_target to 1000:

------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1.04..58580.29 rows=208561 width=67) (actual time=0.054..12434.283 rows=6 loops=1)
   Hash Cond: ((u.field_name)::text = (t.key)::text)
-> Seq Scan on photo_info_data u (cost=0.00..47499.46 rows=2398446 width=49) (actual time=0.012..6129.923 rows=2398446 loops=1) -> Hash (cost=1.02..1.02 rows=2 width=18) (actual time=0.015..0.015 rows=2 loops=1) -> Seq Scan on t_query_data t (cost=0.00..1.02 rows=2 width=18) (actual time=0.002..0.004 rows=2 loops=1)
 Total runtime: 12434.338 ms
(6 rows)


Even when I run this query, I get sequential scan:

explain analyze select * from photo_info_data where field_name = 'f-spot' or field_name = 'shutter';

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Seq Scan on photo_info_data (cost=0.00..59491.69 rows=1705 width=49) (actual time=0.018..1535.963 rows=6 loops=1) Filter: (((field_name)::text = 'f-spot'::text) OR ((field_name)::text = 'shutter'::text))
 Total runtime: 1536.010 ms
(3 rows)

These are the representations of te values 'f-spot' and 'shutter' for the field field_name in photo_info_data table:

xmltest=# select field_name, count(*) from user_info_data where field_name in ('visina', 'spol') group by field_name;
 field_name | count
------------+-------
 'f-spot'   |     3
 'shutter'  |     3
(2 rows)


Maybe my test-data is poor? As I've mentioned, photo_info_data has little over 2300000 rows. And this is complete 'distribution' of the data:

xmltest=# select field_name, count(*) from user_info_data group by field_name order by count(*) desc;
   field_name   | count
----------------+--------
 field_Xx1      | 350000
 field_Xx2      | 332447
 field_Xx3      | 297414
 field_Xx4      | 262394
 field_Xx5      | 227396
 field_Xx6      | 192547
 field_Xx7      | 157612
 field_Xx8      | 122543
 field_Xx9      |  87442
 field_Xx10     |  52296
 field_1        |  50000
 field_2        |  47389
 field_3        |  42412
 field_4        |  37390
 field_5        |  32366
 field_6        |  27238
 field_7        |  22360
 field_Xx11     |  17589
 field_8        |  17412
 field_9        |  12383
 field_10       |   7386
 field_11       |   2410
 f-spot         |      3
 shutter        |      3
 focal          |      3
 flash          |      3
 m_city         |      3
 person         |      3
 iso            |      2
(29 rows)

No matter what field_name value I enter in WHERE condition, planner chooses sequential scan. Only when I add seperate index on field_name, planner chooes index scan or bitmap index scan.

	Mike

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux