Alvaro Herrera wrote:
Hint bits are used to mark tuples as created and/or deleted by
transactions that are know committed or aborted. To determine the
visibility of a tuple without such bits set, you need to consult pg_clog
and possibly pg_subtrans, so it is an expensive check. On the other
So, how come there is this outstanding work to do, which will inevitably
be done, and it
hasn't been done until it is 'just too late' to avoid getting in the way
of the query?
The OP didn't suggest that he had just loaded the data.
Also - is it the case that this only affects the case where updated
pages were spilled
during the transaction that changed them? ie, if we commit a
transaction and there
are changed rows still in the cache since their pages are not evicted
yet, are the hint
bits set immediately so that page is written just once? Seems this
would be common
in most OLTP systems.
Heikki points out that the list might get big and need to be abandoned,
but then you
fall back to scheduling a clog scan that can apply the bits, which does
what you have
now, though hopefully in a way that fills slack disk IO rather than
waiting for the
read.
Matthew says: 'it would be a list of changes since the last checkpoint'
but I don't
see why you can't start writing hints to in-memory pages as soon as the
transaction
ends. You might fall behind, but I doubt it with modern CPU speeds.
I can't see why Pavan's suggestion to try to update as many of the bits
as possible
when a dirty page is evicted would be contentious.
I do think this is something of interest to users, not just developers,
since it
may influence the way updates are processed where it is reasonable to do
so in 'bite sized chunks' as a multipart workflow.