Help in avoiding a query 'Warm-Up' period/shared buffer cache

Hello,

We have a web application running against a Postgres 8.1 database. Basically, 
every time I run a report after no other reports have been run for several 
hours, the report takes significantly longer (e.g. 30 seconds) than if I 
re-run it immediately afterward, or run it when the web application has been 
used recently (< 1 second).

Keep in mind that our web app might issue 30 or more individual queries to 
return a given report, and that this behavior is not isolated to a single 
report type - it happens for any of the reports after the web app has been 
inactive.  Also, I can trace it back to the timing of the underlying queries, 
which show this same behavior (i.e. it's not overhead in our web app).

So it appears to be some sort of caching issue.  I'm not 100% clear on how 
the shared buffer cache works, or what we might do to make sure that we 
don't have these periods where queries take a long time.  Since our users 
typically leave the web app idle for a long time and then come back to it, 
reports that generally take a second taking 30 seconds is a real problem 
for us.

I have isolated a single example of one such query which is very slow when no 
other queries have been run, and then speeds up significantly on the second 
run.

First run, after a night of inactivity:

explain analyze SELECT average_size, end_time FROM 1min_events WHERE file_id = 
'137271' AND end_time > now() - interval '2 minutes' ORDER BY end_time DESC 
LIMIT 1;

 Limit  (cost=47.06..47.06 rows=1 width=24) (actual time=313.585..313.585 
rows=1 loops=1)
   ->  Sort  (cost=47.06..47.06 rows=1 width=24) (actual time=313.584..313.584 
rows=1 loops=1)
         Sort Key: end_time
         ->  Bitmap Heap Scan on 1min_events  (cost=44.03..47.05 rows=1 
width=24) (actual time=313.562..313.568 rows=2 loops=1)
               Recheck Cond: ((end_time > (now() - '00:02:00'::interval)) AND 
(file_id = 137271))
               ->  BitmapAnd  (cost=44.03..44.03 rows=1 width=0) (actual 
time=313.551..313.551 rows=0 loops=1)
                     ->  Bitmap Index Scan on 1min_events_end_idx  
(cost=0.00..5.93 rows=551 width=0) (actual time=0.076..0.076 rows=46 loops=1)
                           Index Cond: (end_time > (now() - 
'00:02:00'::interval))
                     ->  Bitmap Index Scan on 1min_events_file_id_begin_idx  
(cost=0.00..37.85 rows=3670 width=0) (actual time=313.468..313.468 rows=11082 
loops=1)
                           Index Cond: (file_id = 137271)
 Total runtime: 313.643 ms
(11 rows)

Second run, immediately after the first:

 explain analyze SELECT average_size, end_time FROM 1min_events WHERE file_id 
= '137271' AND end_time > now() - interval '2 minutes' ORDER BY end_time DESC 
LIMIT 1;

 Limit  (cost=47.06..47.06 rows=1 width=24) (actual time=2.209..2.209 rows=1 
loops=1)
   ->  Sort  (cost=47.06..47.06 rows=1 width=24) (actual time=2.208..2.208 
rows=1 loops=1)
         Sort Key: end_time
         ->  Bitmap Heap Scan on 1min_events  (cost=44.03..47.05 rows=1 
width=24) (actual time=2.192..2.194 rows=2 loops=1)
               Recheck Cond: ((end_time > (now() - '00:02:00'::interval)) AND 
(file_id = 137271))
               ->  BitmapAnd  (cost=44.03..44.03 rows=1 width=0) (actual 
time=2.186..2.186 rows=0 loops=1)
                     ->  Bitmap Index Scan on 1min_events_end_idx  
(cost=0.00..5.93 rows=551 width=0) (actual time=0.076..0.076 rows=46 loops=1)
                           Index Cond: (end_time > (now() - 
'00:02:00'::interval))
                     ->  Bitmap Index Scan on 1min_events_file_id_begin_idx  
(cost=0.00..37.85 rows=3670 width=0) (actual time=2.106..2.106 rows=11082 
loops=1)
                           Index Cond: (file_id = 137271)
 Total runtime: 2.276 ms
(11 rows)

One of the things that is perplexing about the initial slowness of this query 
is that it's accessing the most recent rows in the table (i.e. those from 
the last 2 minutes), so I would expect the OS cache to already hold these 
new pages.

Some general information about the server / db:

1) The database is 25 GB and has about 60 tables - some very small, but 
several with more than 5 million rows.

2) The table I am querying above (1min_events) has 5.5 million rows.  It is 
indexed on end_time and has a compound index on (file_id, begin_time).
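
For reference, the indexes involved look roughly like this (index and column 
names are taken from the plans above; the exact DDL is a sketch and may 
differ from our schema):

```sql
-- Sketch of the existing indexes, reconstructed from the plan output above.
CREATE INDEX 1min_events_end_idx ON 1min_events (end_time);
CREATE INDEX 1min_events_file_id_begin_idx ON 1min_events (file_id, begin_time);
```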

3) The following are running on the server that holds the db:

a) A program that reads files and makes several (5-10) database calls per 
minute (these calls tend to take < 100 ms each), inserting tens of rows into 
several of the tables.

b) An Apache web server

c) The Postgres 8.1 database itself

d) Periodic cron jobs (generally at 11pm, 1am, and 3am) that truncate some 
of the older data

e) Autovacuum is on, with a 60-second naptime and low scale factors (0.2), 
so analyzes and vacuums happen throughout the day - vacuums are also 
generally triggered by the truncate cron jobs.

4) Some of our config settings:

shared_buffers = 8192        # 64 MB at the default 8 kB page size
work_mem = 8192              # 8 MB (work_mem is in kB)

Total RAM on the server is 1 GB.
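
In case it matters, the autovacuum-related lines in our postgresql.conf look 
roughly like this (a sketch; the two scale factors are both set to 0.2, as 
mentioned above):

```
autovacuum = on
autovacuum_naptime = 60                  # seconds between autovacuum runs
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.2
```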


Basically, any advice as to what to look at to avoid this situation would be 
greatly appreciated.  Is this simply a matter of tuning the shared_buffers 
parameter?  If not, would scheduling a set of queries to keep the cache warm 
be a reasonable solution?
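
If pre-loading is the way to go, this is the sort of query set I had in mind, 
run from cron via psql every few minutes.  The specific file_id and the 
10-minute interval are only examples; the idea is just to touch the same 
index and heap pages the reports use:

```sql
-- Run periodically to keep the relevant index and heap pages warm.
-- file_id '137271' is only an example value.
SELECT count(*) FROM 1min_events
WHERE end_time > now() - interval '10 minutes';

SELECT count(*) FROM 1min_events
WHERE file_id = '137271';
```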

Thanks in advance,


Mark


