Re: Proposal of tunable fix for scalability of 8.4


On Sat, 2009-03-14 at 12:09 -0400, Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@xxxxxxxxxxxxxxxx> writes:
> > WALInsertLock is also quite high on Jignesh's list. That I've seen 
> > become the bottleneck on other tests too.
> 
> Yeah, that's been seen to be an issue before.  I had the germ of an idea
> about how to fix that:
> 
> 	... with no lock, determine size of WAL record ...
> 	obtain WALInsertLock
> 	identify WAL start address of my record, advance insert pointer
> 		past record end
> 	*release* WALInsertLock
> 	without lock, copy record into the space just reserved
> 
> The idea here is to allow parallelization of the copying of data into
> the buffers.  The hold time on WALInsertLock would be very short.  Maybe
> it could even become a spinlock, though I'm not sure, because the
> "advance insert pointer" bit is more complicated than it looks (you have
> to allow for the extra overhead when crossing a WAL page boundary).
> 
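
The steps quoted above can be sketched as a toy in-memory log. This is only an illustration of the reserve-then-copy idea, not PostgreSQL code: `WalBuffer`, `reserve` and `insert` are hypothetical names, the buffer is a flat `bytearray`, and the page-boundary overhead mentioned in the quote is ignored entirely.

```python
import threading


class WalBuffer:
    """Toy stand-in for the WAL buffers (hypothetical, not PostgreSQL's layout)."""

    def __init__(self, size=65536):
        self.buf = bytearray(size)
        self.insert_pos = 0                   # next free byte in the buffer
        self.insert_lock = threading.Lock()   # plays the role of WALInsertLock

    def reserve(self, length):
        # Short critical section: only the pointer arithmetic happens here.
        with self.insert_lock:
            start = self.insert_pos
            if start + length > len(self.buf):
                raise MemoryError("toy buffer is full")
            self.insert_pos = start + length
        return start

    def insert(self, record):
        length = len(record)                  # size is determined with no lock held
        start = self.reserve(length)          # lock is taken and released in here
        # Copy without the lock; the reserved range belongs to this caller only.
        self.buf[start:start + length] = record
        return start
```

Because each caller copies into a range nobody else was handed, the copies can run in parallel. What the sketch does not answer is how a flusher learns that a reserved range has actually been filled, which is the problem raised in the next quoted paragraph.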
> Now the fly in the ointment is that there would need to be some way to
> ensure that we didn't write data out to disk until it was valid; in
> particular how do we implement a request to flush WAL up to a particular
> LSN value, when maybe some of the records before that haven't been fully
> transferred into the buffers yet?  The best idea I've thought of so far
> is shared/exclusive locks on the individual WAL buffer pages, with the
> rather unusual behavior that writers of the page would take shared lock
> and only the reader (he who has to dump to disk) would take exclusive
> lock.  But maybe there's a better way.  Currently I don't believe that
> dumping a WAL buffer (WALWriteLock) blocks insertion of new WAL data,
> and it would be nice to preserve that property.
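
A minimal sketch of the per-page locking Tom describes, with the roles reversed from the usual reader/writer pattern: inserters take the page lock in shared mode, and the process dumping the page takes it in exclusive mode. `SharedExclusiveLock` and `WalPage` are hypothetical names for illustration, and the lock is a simple condition-variable implementation rather than anything PostgreSQL provides.

```python
import threading


class SharedExclusiveLock:
    """Simple shared/exclusive lock built on a condition variable."""

    def __init__(self):
        self._cond = threading.Condition()
        self._sharers = 0
        self._exclusive = False

    def acquire_shared(self):
        with self._cond:
            while self._exclusive:
                self._cond.wait()
            self._sharers += 1

    def release_shared(self):
        with self._cond:
            self._sharers -= 1
            if self._sharers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._exclusive or self._sharers:
                self._cond.wait()
            self._exclusive = True

    def release_exclusive(self):
        with self._cond:
            self._exclusive = False
            self._cond.notify_all()


class WalPage:
    """One WAL buffer page (hypothetical layout)."""

    def __init__(self, size=8192):
        self.data = bytearray(size)
        self.lock = SharedExclusiveLock()

    def copy_in(self, offset, record):
        # Inserters share the lock, so several can fill disjoint ranges at once.
        self.lock.acquire_shared()
        try:
            self.data[offset:offset + len(record)] = record
        finally:
            self.lock.release_shared()

    def flush(self, sink):
        # The flusher waits until every in-progress copy has released the page.
        self.lock.acquire_exclusive()
        try:
            sink.append(bytes(self.data))   # stands in for the write to disk
        finally:
            self.lock.release_exclusive()
```

Two caveats. For the scheme to be safe, an inserter would presumably have to take the shared lock before its reservation becomes visible, not just around the copy as `copy_in` does here. And while `flush` holds the page exclusively, new copies into that page wait, so this sketch gives up the property Tom says it would be nice to preserve.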

Yeh, that's just what we'd discussed previously:
http://markmail.org/message/gectqy3yzvjs2hru#query:Reworking%20WAL%20locking+page:1+mid:gectqy3yzvjs2hru+state:results

Are you thinking of doing this for 8.4? :-)

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
