Re: Re: significant performance hit whenever autovacuum runs after upgrading from 9.0 -> 9.1

Lonni J Friedman <netllama@xxxxxxxxx> · Wed, 23 May 2012 09:12:47 -0700

Thanks for your reply.

On Tue, May 22, 2012 at 7:19 PM, Andy Colson <andy@xxxxxxxxxxxxxxx> wrote:
>  On Mon, May 21, 2012 at 2:05 PM, Lonni J Friedman<netllama@xxxxxxxxx>
>  wrote:
>>>
>>> Greetings,
>>>
>>> When I got in this morning, I found
>>> an autovacuum process that had been running since just before the load
>>> spiked,
>
>
> Autovacuum might need to set the freeze bit very first time it runs.  I
> recall hearing advice about running a 'vacuum freeze' after you insert a
> huge amount of data.  And I recall pg_upgrade doesn't write stats, so did
> you analyze your database?

yes, I ran a 'vacuum analyze' for all databases & tables immediately
following completion of pg_upgrade.

>
> Or, maybe its not vacuum... maybe some of your sql statements are planning
> differently and running really bad.  Can you check some?  Can you log slow
> queries?
>
> Have you checked the status of your raid?  Maybe you lost a drive and its in
> recovery and you have very slow IO?

I checked that initially, but the array is fine.

After banging my head on the wall for  a long time, I happened to
notice that khugepaged was consuming 100% CPU every time autovacuum
was running.  I did:
echo "madvise" > /sys/kernel/mm/transparent_hugepage/defrag

and immediately the entire problem went away.  Load dropped within
minutes from 35.00 to 1.00, and has remained under 4.00 for the past
18 hours.  Prior to disabling defrag, I never saw the load below 10.00
for more than a few seconds at a time.

So this looks like a nasty Fedora16 kernel bug to me, or maybe
postgresql & Fedora16's default kernel settings are just not
compatible?

Is anyone else using Fedora16 & PostgreSQL-9.1 ?

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general