Two strange behaviors I found:
1) Even I restart the machine and restart the PostgreSQL, then I execute the query, i still see the shared_hit. It seems when start PG, i will automatically load the data in the cache of the last time?
2) After I rerun the query, the time for the smaller data is about 19ms, while the time for the bigger data is about 17ms. And the trend is the time for bigger data is always faster than the smaller data for about 1 to 2 ms
Any suggestions? thanks very much.
On Thu, Feb 28, 2013 at 11:19 PM, Julien Cigar <jcigar@xxxxxxxxx> wrote:
Probably that somes pages have to be loaded in memory ...On 02/28/2013 16:11, Ao Jianwang wrote:
Hi,
Does any one can tell me why the same query runs against on smaller data is slower than bigger table. thanks very much.
I am using PostgreSQL9.1.8.
t_apps_1 and t_estimate_1 are about 300M respectively, while *_list_1 about 10M more or less. According to the result, it need to read a lot of blocks(112) from disk.explain (ANALYZE ON, BUFFERS ON, verbose on) SELECT e.t_id, SUM(e.estimate) as estFROMt_estimate_list_1 l,t_apps_list_1 rl,t_apps_1 r,t_estimate_1 eWHEREl.id = rl.dsf_id andl.date = '2012-07-01' andl.fed_id = 202 andl.st_id = 143464 andrl.cat_id = 12201 andl.id = e.list_id andrl.id = r.list_id andr.t_id = e.t_idGROUP BY e.t_id;----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------HashAggregate (cost=2529.91..2530.06 rows=15 width=8) (actual time=1041.391..1041.409 rows=97 loops=1)Buffers: shared hit=304 read=112-> Nested Loop (cost=0.00..2529.84 rows=15 width=8) (actual time=96.752..1041.145 rows=97 loops=1)Buffers: shared hit=304 read=112-> Nested Loop (cost=0.00..312.60 rows=242 width=12) (actual time=62.035..70.239 rows=97 loops=1)Buffers: shared hit=18 read=10-> Nested Loop (cost=0.00..16.56 rows=1 width=12) (actual time=19.520..19.521 rows=1 loops=1)Buffers: shared hit=3 read=6-> Index Scan using t_estimate_list_1_unique on t_estimate_list_1 l (cost=0.00..8.27 rows=1 width=4) (actual time=11.175..11.176 rows=1 loops=1)Index Cond: ((date = '2012-07-01'::date) AND (st_id = 143464) AND (fed_id = 202))Buffers: shared hit=2 read=4-> Index Scan using t_apps_list_1_unique on t_apps_list_1 rl (cost=0.00..8.28 rows=1 width=8) (actual time=8.339..8.339 rows=1 loops=1)Index Cond: ((dsf_id = l.id) AND (cat_id = 12201))Buffers: shared hit=1 read=2-> Index Scan using t_apps_1_pkey on t_apps_1 r (cost=0.00..288.56 rows=598 width=8) (actual time=42.513..50.676 rows=97 loops=1)Index Cond: (list_id = rl.id)Buffers: shared hit=15 read=4-> Index Scan using t_estimate_1_pkey on t_estimate_1 e (cost=0.00..9.15 rows=1 width=12) (actual time=10.006..10.007 rows=1 loops=97)Index Cond: ((list_id = l.id) AND (t_id = r.t_id))Buffers: shared hit=286 read=102Total runtime: 1041.511 ms(21 rows)
The table *_30 are about 30 times larger than *_1 in the above SQL. According to the result, it need to read a lot of blocks(22) from disk.explain (ANALYZE ON, BUFFERS ON) SELECT e.t_id, SUM(e.estimate) as estFROMt_estimate_list_30 l,t_apps_list_30 rl,t_apps_30 r,t_estimate_30 eWHEREl.id = rl.dsf_id andl.date = '2012-07-01' andl.fed_id = 202 andl.st_id = 143464 andrl.cat_id = 12201 andl.id = e.list_id andrl.id = r.list_id andr.t_id = e.t_idGROUP BY e.t_id;QUERY PLAN----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------HashAggregate (cost=3494.89..3495.04 rows=15 width=8) (actual time=160.612..160.632 rows=97 loops=1)Buffers: shared hit=493 read=22-> Nested Loop (cost=0.00..3494.81 rows=15 width=8) (actual time=151.183..160.533 rows=97 loops=1)Buffers: shared hit=493 read=22-> Nested Loop (cost=0.00..431.42 rows=240 width=12) (actual time=105.810..106.597 rows=97 loops=1)Buffers: shared hit=20 read=10-> Nested Loop (cost=0.00..16.58 rows=1 width=12) (actual time=52.804..52.805 rows=1 loops=1)Buffers: shared hit=4 read=6-> Index Scan using t_estimate_list_5_unique on t_estimate_list_5 l (cost=0.00..8.27 rows=1 width=4) (actual time=19.846..19.846 rows=1 loops=1)Index Cond: ((date = '2012-07-01'::date) AND (st_id = 143464) AND (fed_id = 202))Buffers: shared hit=2 read=4-> Index Scan using t_apps_list_5_unique on t_apps_list_5 rl (cost=0.00..8.30 rows=1 width=8) (actual time=32.951..32.952 rows=1 loops=1)Index Cond: ((dsf_id = l.id) AND (cat_id = 12201))Buffers: shared hit=2 read=2-> Index Scan using t_apps_5_pkey on t_apps_5 r (cost=0.00..393.68 rows=1693 width=8) (actual time=53.004..53.755 rows=97 loops=1)Index Cond: (list_id = rl.id)Buffers: shared hit=16 read=4-> Index Scan using t_estimate_5_pkey on t_estimate_5 e (cost=0.00..12.75 rows=1 width=12) (actual time=0.555..0.555 rows=1 loops=97)Index Cond: ((list_id = l.id) AND (t_id = r.t_id))Buffers: shared hit=473 read=12Total runtime: 160.729 ms(21 rows)
It should be faster if you re-run the same query just after
-- No trees were killed in the creation of this message. However, many electrons were terribly inconvenienced.