Re: memory allocation ; postgresql-8.0

"Kavan, Dan (IMS)" <KavanD@xxxxxxxxxx> · Thu, 12 May 2005 11:10:00 -0400

Hi Scott,

Thanks again for all your tips.

If I knock the buffer size down to 65,536 (still higher than what you
are recommending)  then my shmmax becomes:
256,000 + 550,292,685 (65536*8396.8) + 1,454,100 = 552,002,785  

That will leave me with 3.5 GB of free memory for the system & work
memory to use.
Will those free system resources ever get used with a 10 million record,
10 GB database?

If I go with 65,536 as my buffer size, Would having the SHMMAX set to 1
GB on my sysctl.conf system parameters allow me to run two seperate
instances of postgresql on 2 seperate ports?

~DjK

-----Original Message-----
From: Scott Marlowe [mailto:smarlowe@xxxxxxxxxxxxxxxxx] 
Sent: Friday, May 06, 2005 3:31 PM
To: Kavan, Dan (IMS)
Subject: RE: [ADMIN] memory allocation ; postgresql-8.0

On Fri, 2005-05-06 at 14:12, Kavan, Dan (IMS) wrote:
> <Scott>Unless you routinely actually handle data sets that are 1.9 
> gigabytes in size, it's probably not a great idea. <Dj>I knew I 
> wouldn't be that easy. ;) We have a database going on here(in a couple

> weeks) that will handle 10
> million records.   I'm not sure how to measure if that means I will or
> won't have 1.9 GB in datasets.  Right now, most of my memory shows up 
> as free (top) and I have 4GB of swap, virtual memory, that hasn't been

> tapped, yet.  I put a copy of what I have in sysctl.conf and 
> postgresql.conf below ( the memory section )

Well, the only way you'll really know is when you have 10,000,000
records and start working with them.  You might want to create a test
data set and see how it runs on your setup with varying amounts of
buffers allocated to postgresql. I'd bet that in most circumstances,
you'll find 10,000 buffers, at most 20,000 buffers working best for you.

> <Scott>PostgreSQL doesn't really cache data in the classical sense.  
> All the data stored in its internal buffers is tossed away when the 
> last backend referencing said material stops referencing it. 
> <Dj>That's the reason why I leave the remaining 2GB of memory to the 
> system.

But, keep in mind, that reduces the amount of space the kernel now has
to play with, as well as the memory left for sorts (sort_mem in <8.0,
working_mem in 8.0).  It's all about tradeoffs.  1.9gig for pgsql is
usually a bit much, but not always.

> <Scott>Plus, there's a fair bit of overhead to managing such a large 
> data set, and, until 8.0, the algorithms for doing so with efficiency 
> weren't a part of PostgreSQL. <Dj>Besides full vacuuming and backups 
> what kind of maintenance is involved in managing a  database with 10 
> million records?

That's not the ovrehead I'm talking about. I mean the CPU overhead from
maintaining that large list of buffers and searching them each time you
access a buffer.  Again, 8.0 is better at it than <8.0

> <Scott>While some future version might incorporate genuine caching 
> that could utilize all that memory, for now, one is still better off 
> giving postgresql about 250 megabytes max and letting the kernel use 
> the rest for caching. <Dj>Does that logic scale to a 10 GB database 
> with 10 million records and 2 other small databases all with 10-20 
> simultaneous users doing queries?

It does for 7.4 and before, which really didn't handle large internal
buffers all that efficiently.  8.0 is an unknown to me in that area.

> 
> <Dj>
> #my startup script includes
> sysctl -w vm.overcommit_memory=2

Keep in mind that according to recent postings I've seen in the kernel
mailing list and the pgsql mailing lists, the vm.overcommit settings are
sometimes ignored, and the oom killer still stalks the night and kills
processes that are memory hogs.  IF postgresql is set to use 1.9 gig
shared memory, it's likely to be the first victim should the oom killer
come out to play.

> shared_buffers = 230000		# min 16, at least
max_connections*2,
> 8KB each was 65536
> work_mem = 1024		# min 64, size in KB

I'd increase our work mem to 16384 or so.  Depends on how many parallel
clients you're likely to have.   You want to allow sorts to happen in
memory when possible, but you don't want to starve the other processes
for memory or you'll get a swap storm.

> max_fsm_pages = 30000		# min max_fsm_relations*16, 6 bytes each
> max_fsm_relations = 1500	# min 100, ~50 bytes each

You'll likely want these larger too.  With a hard working large
database, 100,000 / 10000 are not unusual settings for the fsm settings.