Re: 9.2.2 - semop hanging

Kevin Grittner <kgrittn@xxxxxxxxx> · Mon, 15 Jul 2013 12:12:01 -0700 (PDT)

Rafael Domiciano <rafael.domiciano@xxxxxxxxx> wrote:

> PostgreSQL 9.2.2 on x86_64-unknown-linux-gnu, compiled by gcc
> (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit

> CentOS release 6.3 (Final)

> For the past 2 weeks I've been stuck in a very strange situation:
> from time to time (sometimes at intervals of less than 10 minutes),
> the server "hangs" (I don't know what to call it) and every
> connection to postgres (no matter whether it's SELECT, UPDATE,
> DELETE, INSERT, startup, authentication...) seems to get "paused";
> after some seconds (say ~10 or ~15 sec, sometimes less) everything
> "goes OK".

During these episodes, do you see high system CPU time?  If so, try
disabling transparent huge page support, and see whether it affects
the frequency or severity of the episodes.
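
To see what the kernel is currently doing with transparent huge pages,
something like the following rough sketch might help.  It only reads
the sysfs control file; the two paths are the mainline location and
the one used by RHEL/CentOS 6 kernels, and the helper name
active_thp_mode is just made up for illustration.

```python
import re

# Mainline kernels and RHEL/CentOS 6 kernels expose the setting under
# different directory names; check both.
THP_PATHS = (
    "/sys/kernel/mm/transparent_hugepage/enabled",
    "/sys/kernel/mm/redhat_transparent_hugepage/enabled",
)


def active_thp_mode(contents):
    """Return the bracketed (active) value, e.g. 'always' from
    '[always] madvise never', or None if nothing is bracketed."""
    match = re.search(r"\[(\w+)\]", contents)
    return match.group(1) if match else None


if __name__ == "__main__":
    found = False
    for path in THP_PATHS:
        try:
            with open(path) as f:
                mode = active_thp_mode(f.read())
        except OSError:
            continue  # this kernel does not use this path
        found = True
        print(f"{path}: {mode}")
    if not found:
        print("no transparent huge page control file found")
```

If the active mode is "always", writing "never" to that same file as
root turns the feature off until the next reboot.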

> So my first step was to check the disks. Running "iostat"
> apparently showed that the disks were OK.

Did you run iostat during an episode of slowness?  What did it
show?  An interpretation that it was "apparently OK" doesn't
provide much useful information.

> It's a RAID 10, 4 x 600GB SAS, IBM Storage DS3512, over FC. IBM DS
> Storage Manager says that the disks are OK.

Are there any reports to show you when writing was saturated?

>              total       used       free     shared    buffers    cached
> Mem:        145182     130977      14204          0         43    121407
> -/+ buffers/cache:       9526     135655
> Swap:         6143         65       6078

> Following is what I've tried:
> 1) Emre Hasegeli has suggested reducing my shared_buffers, but
> it's already low:
>   total server memory: 141 GB
>   shared_buffers: 16 GB

On a machine with nearly twice that RAM, I've had to decrease
shared_buffers to 2GB to avoid the symptoms you describe.  That is
in conjunction with making the background writer more aggressive
and making sure the checkpoint completion target is set to 0.9.
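
Before changing anything it may be worth recording what those
settings are now.  Here is a small sketch that asks psql for the
relevant rows of pg_settings; it assumes psql is on the PATH and can
connect without a password prompt, and the function names are
hypothetical.  Which bgwriter values count as "more aggressive"
depends on the workload, so it only reports the current values.

```python
import subprocess

# Parameters touched on above; all of these exist in PostgreSQL 9.2.
NAMES = (
    "shared_buffers",
    "bgwriter_delay",
    "bgwriter_lru_maxpages",
    "bgwriter_lru_multiplier",
    "checkpoint_completion_target",
)


def settings_query(names=NAMES):
    """Build the pg_settings query for the given parameter names."""
    quoted = ", ".join("'%s'" % n for n in names)
    return ("SELECT name, setting, unit FROM pg_settings "
            "WHERE name IN (%s) ORDER BY name" % quoted)


def parse_settings(output):
    """Parse 'psql -At' output (name|setting|unit per line) into a dict
    mapping name -> (setting, unit)."""
    result = {}
    for line in output.splitlines():
        parts = line.split("|", 2)
        if len(parts) == 3:
            result[parts[0]] = (parts[1], parts[2])
    return result


if __name__ == "__main__":
    cmd = ["psql", "-w", "-At", "-c", settings_query()]
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              check=True, timeout=10)
    except (OSError, subprocess.SubprocessError) as exc:
        print("could not run psql:", exc)
    else:
        for name, (setting, unit) in sorted(parse_settings(proc.stdout).items()):
            print(name, "=", setting, unit)
```

Note that pg_settings reports shared_buffers as a count of 8kB
buffers, not in GB.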

> Maybe it's too low? I've been thinking of increasing it to 32 GB.

Well, you could try that; if the symptoms get worse, then you might
be willing to go the other direction....

> max_connections = 500 and ~400 connections average

How many cores (not "hardware threads") does the machine have?  You
will probably have better throughput and latency if you use
connection pooling to limit the number of active database
transactions to somewhere around two times the number of cores, or
slightly above that.
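
As a rough illustration of that rule of thumb, the sketch below counts
physical cores (not hardware threads) from /proc/cpuinfo and doubles
the count.  The function names and the fallback to counting
"processor" entries are my own assumptions; treat the result as a
starting point for testing, not a tuned value.

```python
def count_physical_cores(cpuinfo_text):
    """Count distinct (physical id, core id) pairs in /proc/cpuinfo text.

    Hyperthreads of one core share the same pair, so they are counted
    once.  If the fields are missing (some VMs), fall back to the
    number of 'processor' entries.
    """
    cores = set()
    processors = 0
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" in fields:
            processors += 1
        if "physical id" in fields and "core id" in fields:
            cores.add((fields["physical id"], fields["core id"]))
    return len(cores) if cores else processors


def suggested_pool_size(cores, multiplier=2):
    """Starting point for the number of active database connections."""
    return cores * multiplier


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            n = count_physical_cores(f.read())
    except OSError:
        print("/proc/cpuinfo not available on this system")
    else:
        print("physical cores:", n)
        print("suggested active connections:", suggested_pool_size(n))
```

On a machine with 16 cores, for example, this would suggest a pool of
about 32 active connections, far below the ~400 in use now.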

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
