Re: Where to find information on the new HOT tables?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Le Tuesday 05 February 2008 21:28:31 Alvaro Herrera, vous avez écrit :
> Perhaps there is a user-level document somewhere.

I don't think such a document already exists, even rtfm_please doesn't know 
about any as of now. So I'll try to begin something here, and depending on 
the comments I'll publish a short user oriented HOT introduction article.
Here we go...

PostgreSQL implements HOT (Heap Only Tuples), a way for the server to limit 
the work it has to make when updating tuples. That's what we call an 
optimization :)

PostgreSQL MVCC implementation choice means that updating a tuple create a 
entire new version of it and mark the old one as no longer valid (as of the 
updating transaction id). Then VACUUM will have to clean out the old 
reference as soon as possible.
Let's not forget that the indexes pointing the the old tuples need to point to 
the new version of it as of transaction id. PostgreSQL currently does not 
save visibility information into the index, though, reducing the janitoring 
here. But still, for the index, the operation of updating a tuple is 
equivalent to a delete and an insert.
That's before HOT.

Starting with PostgreSQL 8.3, when a tuple is updated and if the update only 
concerns non-indexed columns, the RDBMS is smart enough for the existing 
indexes not to need any update at all.

This is done by creating a new tuple if possible on the same page as the old 
one, and maintaining a chain of updated tuples linking a new version to the 
old one. An HOT tuple is in fact one that can't be reached from any index. 
VACUUM will now only have to prune the tuple versions of the chain that are 
no more visible, and as no index were updated (there was no need to), there's 
no VACUUM work to get done on the indexes.
Of course, for HOT to work properly, PostgreSQL has now to follow each HOT 
chain when SELECT'ing tuples and using an index, but the same amount of 
tuples version was to be read before HOT too. The difference is that with HOT 
the new versions of the HOT-updated tuples are no more reachable via the 
index directly, so PostgreSQL has to follow the chain when reading the heap.

Please comment and correct me if my understanding is wrong (which wouldn't be 
a surprise), if this article over simplified, or not really written in 
English :)
I'd like to publish a correct version of this for us to point asking users to, 
or maybe it could even end up as official documentation/FAQ material?

Regards,
-- 
dim

Attachment: signature.asc
Description: This is a digitally signed message part.


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux