On 4/22/24 12:51, Ron Johnson wrote:
On Mon, Apr 22, 2024 at 3:14 PM Adrian Klaver <adrian.klaver@xxxxxxxxxxx
<mailto:adrian.klaver@xxxxxxxxxxx>> wrote:
1) If they are already in enough of a PK order that the CLUSTER time vs
VACUUM FULL time would not be material as there is not much or any
sorting to do then what does the CLUSTER gain you?
Not much. Now they're just "slightly more ordered" instead of "slightly
less ordered" for little if any extra effort.
2) What evidence is there that the records where still in PK order just
because you deleted based on CREATED_ON? I understand the correlation
between CREATED_ON and the PK just not sure why that would necessarily
translate to an on disk order by PK?
1. Records are appended to tables in INSERT order, and INSERT order is
highly correlated to synthetic PK, by the nature of sequences.
Not something I would count on, see:
https://www.postgresql.org/docs/current/sql-createsequence.html
Notes
for how that may not always be the case.
Also any UPDATE or DELETE is going to change that. There is no guarantee
of order for the data in the table. If there where you would not need to
run CLUSTER.
2. My original email showed that CLUSTER took just as long as VACUUM
FULL. That means not many records had to be sorted, because... the
on-disk order was strongly correlated to PK and CREATED_ON. >
Will that happen *every time* in *every circumstance* in *every
database*? No, and I never said it would. But it does in *my *database
in *this *application.
Which gets us back to your comment upstream:
"What the VACUUM docs "should" do, it seems, is suggest CLUSTER on the
PK, if the PK is a sequence (whether that be an actual sequence, or a
timestamp or something else that grows monotonically)."
This is a case specific to you and this particular circumstance, not a
general rule for VACUUM FULL. If for no other reason then it might make
more sense for the application that the CLUSTER be done on some other
index then the PK.
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx