Search Postgresql Archives

Re: CLUSTER vs. VACUUM FULL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/22/24 12:51, Ron Johnson wrote:
On Mon, Apr 22, 2024 at 3:14 PM Adrian Klaver <adrian.klaver@xxxxxxxxxxx <mailto:adrian.klaver@xxxxxxxxxxx>> wrote:





    1) If they are already in enough of a PK order that the CLUSTER time vs
    VACUUM FULL time would not be material as there is not much or any
sorting to do then what does the CLUSTER gain you?

Not much.  Now they're just "slightly more ordered" instead of "slightly less ordered" for little if any extra effort.

    2) What evidence is there that the records where still in PK order just
    because you deleted based on CREATED_ON? I understand the correlation
    between CREATED_ON and the PK just not sure why that would necessarily
    translate to an on disk order by PK?


1. Records are appended to tables in INSERT order, and INSERT order is highly correlated to synthetic PK, by the nature of sequences.

Not something I would count on, see:

https://www.postgresql.org/docs/current/sql-createsequence.html

Notes

for how that may not always be the case.

Also any UPDATE or DELETE is going to change that. There is no guarantee of order for the data in the table. If there where you would not need to run CLUSTER.

2. My original email showed that CLUSTER took just as long as VACUUM FULL.  That means not many records had to be sorted, because... the on-disk order was strongly correlated to PK and CREATED_ON. > Will that happen *every time* in *every circumstance* in *every database*?  No, and I never said it would.  But it does in *my *database in *this *application.


Which gets us back to your comment upstream:

"What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the PK, if the PK is a sequence (whether that be an actual sequence, or a timestamp or something else that grows monotonically)."

This is a case specific to you and this particular circumstance, not a general rule for VACUUM FULL. If for no other reason then it might make more sense for the application that the CLUSTER be done on some other index then the PK.



--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx






[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux