On Thu, Nov 15, 2012 at 6:07 PM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
> On Thu, Nov 15, 2012 at 2:44 PM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
>
>>>> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 3000}) = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 4000}) = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 6000}) = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 7000}) = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 8000}) = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 9000}) = 0 (Timeout)
>
> This is not entirely inconsistent with the spinlock.  Note that 1000
> is repeated 3 times, and 5000 is missing.
>
> This might just be a misleading random sample we got here.  I've seen
> similar close spacing in some simulations I've run.
>
> It is not clear to me why we use a resolution of 1 msec here.  If the
> OS's implementation of select() eventually rounds to the nearest msec,
> that is its business.  But why do we have to lose intermediate
> precision due to its decision?

Yeah -- you're right, this is definitely a spinlock issue.  Next steps:

*) in mostly read workloads, we have a couple of known frequent
offenders.  In particular the 'BufFreelistLock'.  One way we can
influence that guy is to try and significantly lower/raise shared
buffers.  So this is one thing to try.

*) failing that, the LWLOCK_STATS macro can be compiled in to give us
some information about the particular lock(s) we're binding on.
Hopefully it's a lwlock -- this will make diagnosing the problem
easier.

*) if we're not blocking on a lwlock, it's possibly a buffer pin related
issue?  I've seen this before, for example on an index scan that is
dependent on a seq scan.
This long thread:
"http://postgresql.1045698.n5.nabble.com/9-2beta1-parallel-queries-ReleasePredicateLocks-CheckForSerializableConflictIn-in-the-oprofile-td5709812i100.html"
has a lot of information about that case and deserves a review.

*) we can consider experimenting with futex
(http://archives.postgresql.org/pgsql-hackers/2012-06/msg01588.php)
to see if things improve.  This is dangerous, and could crash your
server/eat your data, so fair warning.

merlin

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
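
For reference, here is a rough sketch of why a backoff kept at 1 msec
resolution could produce the repeated 1000s and the missing 5000 in the
quoted select() trace.  It assumes the delay starts at 1 msec, grows
each time by a random fraction of itself rounded to a whole msec, and
wraps back to the minimum past a cap.  The constants and the names
next_delay_msec and delay_sequence_usec are illustrative assumptions,
not taken from this thread.

```python
import random

MIN_DELAY_MSEC = 1      # assumed starting delay
MAX_DELAY_MSEC = 1000   # assumed cap; past it the delay wraps to the minimum


def next_delay_msec(cur_delay, rng):
    """Grow the delay by a random fraction of itself, kept in whole msec."""
    # At cur_delay == 1 the increment is int(r + 0.5), i.e. 0 or 1, so the
    # same 1 msec delay can repeat several times in a row.
    cur_delay += int(cur_delay * rng.random() + 0.5)
    if cur_delay > MAX_DELAY_MSEC:
        cur_delay = MIN_DELAY_MSEC
    return cur_delay


def delay_sequence_usec(steps, seed=None):
    """Return the timeouts (in usec) that successive sleeps would use."""
    rng = random.Random(seed)
    delay = MIN_DELAY_MSEC
    timeouts = []
    for _ in range(steps):
        timeouts.append(delay * 1000)
        delay = next_delay_msec(delay, rng)
    return timeouts


if __name__ == "__main__":
    print(delay_sequence_usec(10))
```

Under these assumptions every timeout is a multiple of 1000 usec, so
small delays can only stay put or jump by whole milliseconds, which is
the loss of intermediate precision being asked about.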