Couldn't have said it better myself; covered all the bases. If PG wants to become an industrial strength database, worthy of replacing DB2/etc., then these are the sorts of knobs and switches it will need. -- None of that is anything for amateurs to play with. Not jam a stick in anybody's eye, but shouldn't database pros not be amateurs? Or are most PG-ers coders who don't really want to design and tune a database? Robert ---- Original message ---- >Date: Tue, 12 Oct 2010 09:35:56 -0500 >From: pgsql-performance-owner@xxxxxxxxxxxxxx (on behalf of "Kevin Grittner" <Kevin.Grittner@xxxxxxxxxxxx>) >Subject: Re: How does PG know if data is in memory? >To: <pgsql-performance@xxxxxxxxxxxxxx>,<gnuoytr@xxxxxxx> > ><gnuoytr@xxxxxxx> wrote: > >> An approach that works can be found in DB2, and likely elsewhere. >> >> The key is that tablespaces/tables/indexes/buffers are all >> attached through the bufferpool (the DB2 term). A tablespace/ >> bufferpool match is defined. Then tables and indexes are assigned >> to the tablespace (and implicitly, the bufferpool). As a result, >> one can effectively pin data in memory. This is very useful, but >> not low hanging fruit to implement. > >This sounds similar to Sybase named caches. You can segment off >portions of the memory for specific caches, break that up into space >reserved for different I/O buffer sizes, and bind specific database >objects (tables and indexes) to specific caches. On the few >occasions where someone had failed to configure the named caches >when setting up a machine, it was caught almost immediately after >deployment because of end-user complaints about poor performance. >This was so critical to performance for us when we were using >Sybase, that one of my first reactions on finding it missing in >PostgreSQL was distress over the inability to tune as I had. > >When I posted to the list about it, the response was that LRU >eviction was superior to any tuning any human would do. I didn't >and don't believe that, but have found it's close enough in the >PostgreSQL environment to be *way* down my list of performance >issues. In fact, when looking at the marginal benefits it would >generate in PostgreSQL when done right, versus the number of people >who would shoot themselves in the foot with it, even I have come >around to feeling it's probably not a good idea. > >FWIW, the four main reasons for using it were: > >(1) Heavily used data could be kept fully cached in RAM and not >driven out by transient activity. > >(2) You could flag a cache used for (1) above as using "relaxed LRU >accounting" -- it saved a lot of time tracking repeated references, >leaving more CPU for other purposes. > >(3) Each named cache had its own separate set of locks, reducing >contention. > >(4) Large tables for which the heap was often were scanned in its >entirety or for a range on the clustered index could be put in a >relatively small cache with large I/O buffers. This avoided blowing >out the default cache space for situations which almost always >required disk I/O anyway. > >None of that is anything for amateurs to play with. You need to set >up caches like that based on evidence from monitoring and do careful >benchmarking of the results to actually achieve improvements over >LRU logic. > >> The introduction of rudimentary tablespaces is a first step. I >> assumed that the point was to get to a DB2-like structure at some >> point. Yes? > >As far as I can tell, there is nobody with that intent. > >-Kevin > >-- >Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) >To make changes to your subscription: >http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance