On Thu, 15 May 2008, Luke Lonergan wrote:
BTW we¹ve removed HINT bit checking in Greenplum DB and improved the
visibility caching which was enough to provide performance at the same level
as with the HINT bit optimization, but avoids this whole ³write the data,
write it to the log also, then write it again just for good measure²
behavior.
This sounds like a good option. I believe I suggested this a few months
ago, however it was rejected because in the worst case (when the hints are
not cached), if you're doing an index scan, you can do twice the number of
seeks as before.
http://archives.postgresql.org/pgsql-performance/2007-12/msg00217.php
The hint data will be four bits per tuple plus overheads, so it could be
made very compact, and therefore likely to stay in the cache fairly well.
Each tuple fetched would have to be spaced really far apart in the
database table in order to exhibit the worst case, because fetching a page
of hint cache will cause 64kB or so of disc to appear in the disc's
read-ahead buffer, which will be equivalent to 128MB worth of database
table (assuming eight tuples per block and no overhead). As soon as you
access another tuple in the same 128MB bracket, you'll hit the disc
read-ahead buffer for the hints.
On balance, to me it still seems like a good option.
Matthew
--
Those who do not understand Unix are condemned to reinvent it, poorly.
-- Henry Spencer