Re: High SYS CPU - need advise

On Thu, Nov 15, 2012 at 4:29 PM, Alvaro Herrera
<alvherre@xxxxxxxxxxxxxxx> wrote:
> Merlin Moncure escribió:
>
>> ok, excellent.   reviewing the log, this immediately caught my eye:
>>
>> recvfrom(8, "\27\3\1\0@", 5, 0, NULL, NULL) = 5
>> recvfrom(8, "\327\327\nl\231LD\211\346\243@WW\254\244\363C\326\247\341\177\255\263~\327HDv-\3466\353"...,
>> 64, 0, NULL, NULL) = 64
>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 3000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 6000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 7000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 8000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 9000})  = 0 (Timeout)
>> semop(41713721, {{2, 1, 0}}, 1)         = 0
>> lseek(295, 0, SEEK_END)                 = 0
>> lseek(296, 0, SEEK_END)                 = 8192
>>
>> this is definitely pointing to spinlock issue.
>
> I met Rik van Riel (Linux kernel hacker) recently and we chatted about
> this briefly.  He strongly suggested that we should consider using
> futexes on Linux instead of spinlocks; the big advantage being that
> futexes sleep instead of spinning when contention is high.  That would
> reduce the system load in this scenario.

Well, so do postgres spinlocks, right?  When we overflow
spins_per_delay we go to pg_usleep, which proxies to select() --
postgres spinlocks are a hybrid implementation.  Moving to futexes is
a possible improvement (that's another discussion) in degenerate cases,
but I'm not sure that I've exactly zeroed in on the problem.  Or am I
missing something?
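
To make the "hybrid" point concrete, here is a minimal sketch of a spin-then-sleep lock.  It is not the s_lock.c code: hybrid_lock, hybrid_unlock, sleep_usec, the max_sleeps parameter and the SPINS_PER_DELAY value are all made up for illustration, and it uses a C11 atomic_flag rather than postgres' platform-specific TAS.  The only part taken from the discussion above is the shape: busy-wait for a bounded number of spins, then sleep via select() with no file descriptors.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <sys/select.h>

/* Illustrative value only; an assumption, not postgres' tuned setting. */
#define SPINS_PER_DELAY 100

/* Sleep for usec microseconds by calling select() with no descriptors. */
static void sleep_usec(long usec)
{
    struct timeval tv;

    tv.tv_sec = usec / 1000000L;
    tv.tv_usec = usec % 1000000L;
    (void) select(0, NULL, NULL, NULL, &tv);
}

/*
 * Hypothetical hybrid lock: spin up to SPINS_PER_DELAY times, then sleep
 * 1 msec and start spinning again.  Returns the number of sleeps taken,
 * or -1 if the lock was still held after max_sleeps sleeps.
 */
static int hybrid_lock(atomic_flag *lock, int max_sleeps)
{
    int spins = 0;
    int sleeps = 0;

    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
    {
        if (++spins < SPINS_PER_DELAY)
            continue;               /* still in the busy-wait phase */
        if (sleeps >= max_sleeps)
            return -1;              /* give up instead of waiting forever */
        sleep_usec(1000L);          /* shows up in strace as select(0, ...) */
        sleeps++;
        spins = 0;
    }
    return sleeps;
}

/* Release the lock taken by hybrid_lock(). */
static void hybrid_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}
```

An uncontended hybrid_lock() returns 0 without ever sleeping; only a waiter that exhausts its spins ends up in select(), which is the pattern visible in the strace output.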

What I've been scratching my head over is what code exactly would
cause an iterative sleep like the above.  The code is here:

  pg_usleep(cur_delay * 1000L);

  /* increase delay by a random fraction between 1X and 2X */
  cur_delay += (int) (cur_delay *
        ((double) random() / (double) MAX_RANDOM_VALUE) + 0.5);
  /* wrap back to minimum delay when max is exceeded */
  if (cur_delay > MAX_DELAY_MSEC)
    cur_delay = MIN_DELAY_MSEC;

...so cur_delay is supposed to increase in a nonlinear fashion.  I've
looked around the sleep, usleep, and latch calls and as yet haven't
found anything that advances the timeout just like that (I need to do
another pass).  And we don't know for sure if this is directly related
to the OP's problem.
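
For anyone who wants to play with the arithmetic, the backoff step can be pulled out into a pure function.  This is only a sketch: next_delay is a hypothetical helper, the random fraction is passed in as frac instead of coming from random(), and the MIN_DELAY_MSEC / MAX_DELAY_MSEC values of 1 and 1000 are assumptions (the 1000 usec first sleep in the trace is at least consistent with a 1 msec minimum).

```c
#include <assert.h>

/* Assumed values for illustration; check s_lock.c for the real ones. */
#define MIN_DELAY_MSEC 1
#define MAX_DELAY_MSEC 1000

/*
 * One backoff step, same arithmetic as the snippet quoted above, with
 * frac standing in for random() / MAX_RANDOM_VALUE (so 0.0 <= frac <= 1.0).
 */
static int next_delay(int cur_delay, double frac)
{
    assert(frac >= 0.0 && frac <= 1.0);

    /* grow by a rounded fraction of the current delay */
    cur_delay += (int) (cur_delay * frac + 0.5);

    /* wrap back to the minimum when the maximum is exceeded */
    if (cur_delay > MAX_DELAY_MSEC)
        cur_delay = MIN_DELAY_MSEC;
    return cur_delay;
}
```

For example, next_delay(4, 0.5) gives 6 and next_delay(9, 1.0) gives 18.  Note that at cur_delay = 1 the rounded increment can only be 0 or 1, so the delay may sit at 1 msec for a few rounds before it starts growing; that may be worth keeping in mind when comparing against the strace output.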

merlin

