Search Postgresql Archives

Re: Waiting on ExclusiveLock on extension

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



To put the top question first:

How can table extension locks explain a a massive spike in CPU usage?

I can imagine 400 connections waiting on disk I/O, but then, wouldn't they all be sleeping?

> Ok, that's a MAJOR hint, because relation 1249 is a system catalog;
> namely pg_attribute. So I think what's happening here is that your
> catalog has become horrifically bloated. I'm 99% certain that VACUUM ALL
> will not vacuum the catalog tables.
>
> Do you by chance have autovacuum turned off?
>
> A manual VACUUM VERBOSE pg_attribute might provide some immediate relief.
>
Autovacuum is turned on. In addition, we do a manual VACUUM ALL at night. VACUUM VERBOSE pg_attribute ran in 0 seconds and processed a few hundred rows.

> Are you using a connection pool? Establishing 50 new database
> connections per second won't do anything to help performance...
>
As I understand it, a pool reduces network and CPU load. We have never seen any issues with those. So the extra monitoring and maintenance cost of a pool seems hard to justify.

> I think what that means is that there was suddenly a big spike in memory
> demand at the OS level, so now the OS is frantically dumping cached
> pages. That in itself won't explain this, but it may be a clue.
>
We monitor memory usage with Cacti. It's a dedicated server and nearly all memory is used as cache. If a script runs and demands memory, that becomes visible as cache is cleared out. There is no change in the amount of memory used as cache around the outage.

> In order to extend a relation we need to ask the filesystem to actually
> extend the file (which presumably means at least writing some metadata
> to disk), and then I think we create a WAL record. Creating the WAL
> record won't by itself write to disk... *unless* the wal_buffers are all
> already full.
>
I have a question here, we have "synchronous_commit = off". So when Postgres extends a page, would it do that just in memory, or does part of the extend operation require synchronous I/O?

> So if you also see an I/O spike when this happens you could well
> just be starved from the I/O system (though obviously it'd be
> better if we handled that situation more elegantly than this).

The SAR data shows no increase in pgpgin/s and pgpgout/s, which if I understand it correctly, means that there is no I/O spike. There is however an enormous increase in CPU usage.

> I do suspect your pgfree/s is very high though; putting 200k pages/s on
> the free list seems like something's broken.
>
The system has constant and considerable load of small writes. The pg_activity tool shows 300 IOPs sustained (it claims max IPs above 11000.) Postgres 9.3 had a comparable pgfree/s.

Would you know a good resource to get more knowledgeable about pgfree, pgpin, pgsteal?

Kind regards,
Andomar


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux