On Fri, 14 Aug 2015 17:39:49 +0530
Deepak Balasubramanyam <deepak.balu@xxxxxxxxx> wrote:

> I have a table (20 million rows) in PostgreSQL 9.4 that contains a bigint
> id as the primary key and another column that contains jsonb data. Queries
> run on this table look like so...
>
> ------------
> ## Query
> ------------
> select ... from table
> WHERE table.column -> 'item' ->> 'name' = 'value'
> ------------
>
> I'd like to make an effort to get PostgreSQL to keep all data available in
> this table, and any index on this table, in memory. This would ensure that
> sequential or index scans made on the data are fairly fast.
>
> Research into this problem indicates that there is no reliable way to get
> PostgreSQL to run entirely off of RAM
> (http://stackoverflow.com/a/24235439/830964). Assuming the table and its
> indexes amount to 15 GB of data on disk, and the machine has 64 GB of RAM
> with shared_buffers set anywhere from 16 to 24 GB, here are my questions...
>
> 1. When PostgreSQL returns data from this query, how can I tell how much
> of the data was cached in memory?

I'm not aware of any way to see that completely on a per-query basis.
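That said, EXPLAIN's BUFFERS option gets you part of the way: it reports,
for a single execution, how many pages were found in shared_buffers
("shared hit") versus requested from the kernel ("read"). A "read" may
still be served from the OS page cache, which PostgreSQL cannot see. A
sketch, using hypothetical table/column names:

------------
-- Prints "Buffers: shared hit=... read=..." per plan node for this
-- one execution (hit = the page was already in shared_buffers).
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM my_table
WHERE my_table.my_column -> 'item' ->> 'name' = 'value';
------------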
> 2. I'm aware that I can tweak the shared buffer so that more data is
> cached. Is there a way to monitor this value for its effectiveness?

Install the pg_buffercache extension and read up on what it provides. It
gives a pretty good view into what PostgreSQL is keeping in memory.
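For example, here is a sketch that counts how many buffers each relation
currently occupies in shared_buffers. pg_buffercache exposes one row per
buffer; the 8192 below assumes the default 8 kB block size:

------------
-- Install once per database (ships in contrib).
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Top 10 relations by space currently held in shared_buffers.
SELECT c.relname,
       count(*)                        AS buffers,
       pg_size_pretty(count(*) * 8192) AS cached
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database())
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
------------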
> 3. Is there a reliable way / calculation (or close to it) to determine a
> point after which PostgreSQL will ask the disk for data vs. the caches?

It will ask the disk for data whenever that data is not in memory. As long
as the data it needs is in memory, it will never talk to the disk unless it
needs to write data back. The cache is a cache. So there are only two
reasons your data wouldn't all be in memory all the time:

1) It doesn't all fit.
2) Some of that memory is needed by other tables/indexes/etc.

As far as when things get evicted from memory, you'll have to look at the
source code, but it's your typical "keep the most commonly needed data in
memory" algorithm.
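If the goal really is to have the whole table and its indexes resident,
note that 9.4 ships the pg_prewarm contrib module, which loads a relation
into shared_buffers on demand (useful after a restart, before the cache
has warmed up on its own). A sketch, with hypothetical relation names:

------------
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Pull the table and an index into shared_buffers; each call
-- returns the number of blocks loaded.
SELECT pg_prewarm('my_table');
SELECT pg_prewarm('my_table_pkey');
------------

This only warms the cache once; it does not pin anything, so the usual
eviction rules still apply afterward.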
What problem are you seeing? What is your performance requirement, and
what is the observed performance? I ask because it's unlikely that you
really need to dig into these details the way you are, and most people who
ask questions like this are misguided in some way.

-- 
Bill Moran

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general