explanation for seeks in VACUUM

Jeff Davis <pgsql@xxxxxxxxxxx> · Fri, 14 Dec 2007 11:29:54 -0800

"bigtable" has about 60M records, about 2M of which are dead at the time
of VACUUM. Shared_buffers are about 1GB, and the machine has 4GB of
memory.

If I run a "SELECT COUNT(*) FROM bigtable", and I ktrace that (FreeBSD)
for 10 seconds, I see only a handful of lseek calls (33), which is no
surprise since I am asking for sequential I/O. I assume those lseeks are
just to skip over pages that already happen to be in shared_buffers.

However, If I have several indexes on that table, and I run a VACUUM, I
observe a lot of seeking. In a 10 second interval, I saw about 5000
lseek calls in the ktrace to the same file descriptor (which is an
index). That's about one every 2ms, so I'm sure a large portion of the
file must have been in the OS buffer cache.

I just don't quite understand what's causing the lseeks.

My understanding is that vacuum uses maintenance_work_mem to hold the
list of dead tuples. In my case that's 2M row versions, times about 6
bytes per entry (in the list of dead tuples) equals about 12MB, which is
much less than 128MB maintenance_work_mem. So it doesn't appear that
maintenance_work_mem is too small.

Even if maintenance_work_mem was the limiting factor, wouldn't the
VACUUM still be operating mostly sequentially, even if it takes multiple
passes?

The only seeking that it seems like VACUUM would need to do in an index
file is when an index page completely empties out, but that wouldn't
account for 5000 lseeks in 10 seconds, would it? 

Where am I going wrong? Are many of these lseeks no-ops or something?

Regards,
	Jeff Davis

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend