> > > > > I dunno if you've got anything gdb-equivalent under Windows,
> > > > > but that's the first thing I'd be interested in ...
> > > >
> > > > Here ya go:
> > > >
> > > > http://www.devisser-siderius.com/stack1.jpg
> > > > http://www.devisser-siderius.com/stack2.jpg
> > > > http://www.devisser-siderius.com/stack3.jpg
> > > >
> > > > There are three threads in the process. I guess thread 1
> > > > (stack1.jpg) is the most interesting.
> > > >
> > > > I also noted that cranking up concurrency in my app reproduces the
> > > > problem in about 4 minutes ;-)
> > > >
> > > > Just reproduced again.
> > >
> > > Actually, stack2 looks very interesting. Does it "stay stuck" in
> > > pg_queue_signal? That's really not supposed to happen.
> >
> > Yes it does.
>
> An update on that: There is actually *two* processes in this
> state, both hanging in pg_queue_signal. I've looked at the
> source of that, and the obvious candidate for hanging is
> EnterCriticalSection. I also found this:
>
> http://blogs.msdn.com/larryosterman/archive/2005/03/02/383685.aspx
>
> where they say:
>
> "
> In addition, for Windows 2003, SP1, the EnterCriticalSection
> API has a subtle change that's intended tor resolve many of
> the lock convoy issues. Before
> Win2003 SP1, if 10 threads were blocked on
> EnterCriticalSection and all 10 threads had the same
> priority, then EnterCriticalSection would service those
> threads in a FIFO (first -in, first-out) basis. Starting in
> Windows 2003 SP1, the EnterCriticalSection will wake up a
> random thread from the waiting threads. If all the threads
> are doing the same thing (like a thread pool) this won't make
> much of a difference, but if the different threads are doing
> different work (like the critical section protecting a widely
> accessed object), this will go a long way towards removing
> lock convoy semantics.
> "
>
> Could it be they broke it when they did that????

In theory, yes, but it still seems a bit far-fetched :-(

If you have the environment to rebuild, can you try changing the order
of these two lines:

	ResetEvent(pgwin32_signal_event);
	LeaveCriticalSection(&pg_signal_crit_sec);

in backend/port/win32/signal.c?

And if not, can you also try disabling the stats collector and see if
that makes a difference? (It could be a workaround.)

//Magnus
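
For anyone following along without a Windows build environment, here is a rough, portable sketch of what the two orderings look like. It is not the code in signal.c: the names (SignalQueue, sq_enter, sq_leave, sq_queue_signal, sq_dispatch) are made up for illustration, a C11 atomic_flag spin lock stands in for the critical section, and an atomic_bool stands in for the manual-reset event. I am also assuming the consumer drains a bitmask of pending signals before resetting the event.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; NOT the PostgreSQL or Win32 definitions. */
#define SQ_SIGNAL_COUNT 32
static_assert(SQ_SIGNAL_COUNT <= sizeof(unsigned) * CHAR_BIT,
              "pending mask must fit in an unsigned int");

typedef struct {
    atomic_flag lock;      /* plays the role of the critical section */
    atomic_bool event_set; /* plays the role of the manual-reset event */
    unsigned    pending;   /* bitmask of queued signals, guarded by lock */
} SignalQueue;

#define SIGNAL_QUEUE_INIT { ATOMIC_FLAG_INIT, false, 0 }

/* Spin until the lock is acquired (stand-in for EnterCriticalSection). */
static void sq_enter(SignalQueue *q)
{
    while (atomic_flag_test_and_set(&q->lock))
        ; /* busy-wait */
}

/* Release the lock (stand-in for LeaveCriticalSection). */
static void sq_leave(SignalQueue *q)
{
    atomic_flag_clear(&q->lock);
}

/* Producer: record the signal under the lock, then raise the event. */
static void sq_queue_signal(SignalQueue *q, int signum)
{
    if (signum <= 0 || signum >= SQ_SIGNAL_COUNT)
        return;
    sq_enter(q);
    q->pending |= 1u << signum;
    sq_leave(q);
    atomic_store(&q->event_set, true);
}

/*
 * Consumer: drain the pending mask and clear the event.
 * reset_after_leave == false: reset, then leave (the order as quoted).
 * reset_after_leave == true:  leave, then reset (the swapped order).
 */
static unsigned sq_dispatch(SignalQueue *q, bool reset_after_leave)
{
    unsigned handled;

    sq_enter(q);
    handled = q->pending;
    q->pending = 0;
    if (!reset_after_leave) {
        atomic_store(&q->event_set, false);
        sq_leave(q);
    } else {
        sq_leave(q);
        atomic_store(&q->event_set, false);
    }
    return handled;
}
```

In this sketch the swapped order leaves a small window between sq_leave and the reset in which a producer can queue a signal and raise the event, only to have the event cleared again. That is acceptable for a diagnostic experiment, but it is one reason the swap is probably not a fix in itself.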