On Wed, 16 Jun 2021 at 12:09, Julien Rouhaud <rjuju123@xxxxxxxxx> wrote:
>
> On Wed, Jun 16, 2021 at 12:02:52PM +0530, Atul Kumar wrote:
> >
> > Sometimes I run a Postgres query it takes 30 seconds. Then, I
> > immediately run the same query and it takes 2 seconds. It appears that
> > Postgres has some sort of caching. Can I somehow see what that cache
> > is holding?
>
> You can use pgbuffercache for that:
> https://www.postgresql.org/docs/current/pgbuffercache.html
>
pgbuffercache is useful, but you also need to check the explain plan to help understand if it just cache issue or something else.
i mean for cases where change in buffers hit vs buffers read, then maybe cache is helping.
when you have time, you can find this very helpful.
pg internals
test=# create table t(id int, name text, primary key(id));
CREATE TABLE
test=# insert into t select x, md5(x::text) from generate_series(1, 1000000) x;
INSERT 0 1000000
test=# explain (analyze,buffers) select * from t where id < 10000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=6394.67..19138.75 rows=352806 width=36) (actual time=0.433..1.750 rows=9999 loops=1)
Recheck Cond: (id < 10000)
Heap Blocks: exact=84
Buffers: shared hit=1 read=113 dirtied=84
-> Bitmap Index Scan on t_pkey (cost=0.00..6306.47 rows=352806 width=0) (actual time=0.408..0.408 rows=9999 loops=1)
Index Cond: (id < 10000)
Buffers: shared hit=1 read=29
Planning:
Buffers: shared hit=1 read=8
Planning Time: 0.226 ms
Execution Time: 2.233 ms
(11 rows)
test=# explain (analyze,buffers) select * from t where id < 10000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=6394.67..19138.75 rows=352806 width=36) (actual time=0.419..1.447 rows=9999 loops=1)
Recheck Cond: (id < 10000)
Heap Blocks: exact=84
Buffers: shared hit=114
-> Bitmap Index Scan on t_pkey (cost=0.00..6306.47 rows=352806 width=0) (actual time=0.399..0.400 rows=9999 loops=1)
Index Cond: (id < 10000)
Buffers: shared hit=30
Planning Time: 0.145 ms
Execution Time: 1.921 ms
(9 rows)
and then check buffercache
test=# create table t(id int, name text, primary key(id));
CREATE TABLE
test=# insert into t select x, md5(x::text) from generate_series(1, 1000000) x;
INSERT 0 1000000
test=# explain (analyze,buffers) select * from t where id < 10000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=6394.67..19138.75 rows=352806 width=36) (actual time=0.433..1.750 rows=9999 loops=1)
Recheck Cond: (id < 10000)
Heap Blocks: exact=84
Buffers: shared hit=1 read=113 dirtied=84
-> Bitmap Index Scan on t_pkey (cost=0.00..6306.47 rows=352806 width=0) (actual time=0.408..0.408 rows=9999 loops=1)
Index Cond: (id < 10000)
Buffers: shared hit=1 read=29
Planning:
Buffers: shared hit=1 read=8
Planning Time: 0.226 ms
Execution Time: 2.233 ms
(11 rows)
test=# explain (analyze,buffers) select * from t where id < 10000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=6394.67..19138.75 rows=352806 width=36) (actual time=0.419..1.447 rows=9999 loops=1)
Recheck Cond: (id < 10000)
Heap Blocks: exact=84
Buffers: shared hit=114
-> Bitmap Index Scan on t_pkey (cost=0.00..6306.47 rows=352806 width=0) (actual time=0.399..0.400 rows=9999 loops=1)
Index Cond: (id < 10000)
Buffers: shared hit=30
Planning Time: 0.145 ms
Execution Time: 1.921 ms
(9 rows)
and then check buffercache
SELECT n.nspname, c.relname, count(*) AS buffers
FROM pg_buffercache b JOIN pg_class c
ON b.relfilenode = pg_relation_filenode(c.oid) AND
b.reldatabase IN (0, (SELECT oid FROM pg_database
WHERE datname = current_database()))
JOIN pg_namespace n ON n.oid = c.relnamespace
GROUP BY n.nspname, c.relname
ORDER BY 3 DESC
LIMIT 5;
nspname | relname | buffers
------------+---------------------------+---------
public | t | 3004
public | t_pkey | 816
pg_catalog | pg_proc | 20
pg_catalog | pg_depend_reference_index | 13
pg_catalog | pg_attribute | 13
> > Can I force all caches to be cleared for tuning purposes?
> > So I need to clear the cache without restarting/rebooting the postgres
> > server to check the correct execution plan of my query.
>
> No, cleaning postgres cache can only be done with a service restart. That
> being said, tuning shouldn't be done assuming that there's no cache. On the
> opposite maybe what you should do is use pg_prewarm
> (https://www.postgresql.org/docs/current/pgprewarm.html) to make sure that your
> cache isn't empty after a restart.
>
yes, flushing cache requires a restart. but you can also flush cache, by repeatedly querying huge tables which are much larger than your buffers and not relevant to your query,
that way the cache contents would be replaced by new content.
also if you can play around by logging onto your machine (do not do in production, it will flush all cache which may be in use elsewhere) as root.
stop postgresql
sync
echo 3 > /proc/sys/vm/drop_caches
start postgresql
sync
echo 3 > /proc/sys/vm/drop_caches
start postgresql
you can try by keeping the shared_buffers/effective_cache_size value low, to ensure not a lot of pages are pulled up into the buffers.
Thanks,
Vijay
Mumbai, India