Re: autovacuum locking question

MichaelDBA <MichaelDBA@xxxxxxxxxxx> · Fri, 6 Dec 2019 12:50:44 -0500

And Just to reiterate my own understanding 
of this...

autovacuum priority is less than a user-initiated request, so issuing a 
manual vacuum (user-initiated request) will not result in being 
cancelled.

Regards,

Michael Vitale

Jeff Janes wrote on 12/6/2019 12:47 PM:

  On Fri, Dec 6, 2019 at 10:55 AM Mike 
Schanne <mschanne@xxxxxxx>
 wrote:

The
 error is not actually showing up very often (I have 8 occurrences from 
11/29 and none since then).  So maybe I should not be concerned about 
it.  I suspect
 we have an I/O bottleneck from other logs (i.e. long checkpoint sync 
times), so this error may be a symptom rather than the cause.

I
 think that at the point it is getting cancelled, it has done all the 
work except the truncation of the empty pages, and reporting the results
 (for example, updating n_live_tup  and n_dead_tup).  If this happens 
every single time (neither last_autovacuum nor last_vacuum ever 
advances) it will eventually cause problems.  So this is mostly a 
symptom, but not entirely.  Simply running a manual vacuum should fix 
the reporting problem.  It is not subject to cancelling, so it will 
detect it is blocking someone and gracefully bow.  Meaning it will 
suspend the truncation, but will still report its results as normal.

Reading
 the table backwards in order to truncate it might be contributing to 
the IO problems as well as being a victim of those problems.  Upgrading 
to v10 might help with this, as it implemented a prefetch where it reads
 the table forward in 128kB chunks, and then jumps backwards one chunk 
at a time.  Rather than just reading backwards 8kB at a time.

Cheers,

Jeff