Re: Idle processes chewing up CPU?

Craig Ringer <craig@xxxxxxxxxxxxxxxxxxxxx> · Wed, 19 Aug 2009 15:06:14 +0800

On 19/08/2009 1:34 PM, Brendan Hill wrote:
> Hi Craig, thanks for the analysis. If I attach a debugger on the runaway
> child process, will this halt execution for all the other child processes
> (ie. freeze the server)? And, can I attach Visual Studio C++ 2008, or is
> there a recommended debugger for Windows debugging?

Visual C++ 2008's debugger should be fine - and it's certainly a lot
nicer to use than windbg.exe. LOTS nicer. I'm surprised you have VS
2008 on your production server, though - or are you planning on using
remote debugging?

Anyway: If you attach to a given backend, execution of the other
backends won't freeze. If you promptly unpause execution of the backend
you attached to, everything will run normally. You might not want to
interrupt the backend's execution for too long at a time though, as my
understanding is that Pg does have tasks that require synchronization
across all backends, and leaving one in a state of paused execution for
too long might slow things down.

I did some quick testing before posting. First, I downloaded and
unpacked the 8.4.0 sources since that's what I'm running on my
workstation. I then established two sessions to an otherwise idle 8.4
DB on WinXP and attached VS 2008 EE's debugger to one of them:

  Tools -> Attach to Process, check "show processes from all users",
  select the target postgres.exe by pid, attach.

It took a while for VS to load symbols for the first time, but the other 
backend was responsive during that time. When VS finished loading 
symbols it auto-resumed execution of the backend.

When I paused execution, the other backend remained responsive. I could
still establish new connections too.

With execution running normally I added a breakpoint at pq_recvbuf:

  Debug -> New Breakpoint -> Break at Function (CTRL-B),
  "pq_recvbuf", line 1 char 1 language "C", OK

then issued a query to the backend I was debugging. It processed the
query and then execution stopped at the breakpoint. I was prompted to
locate the source file I'd broken in, and when I did so it showed an
execution marker at the appropriate point. From there I could step
execution through the sources, etc.

When I was done, I just detached from the process with Debug -> Detach
All, leaving it running as before.

In your position I'd start by waiting until you have an out-of-control
backend, attaching to it without pausing it, and setting a breakpoint at
my_sock_read. If the breakpoint is hit then something's called
my_sock_read again; it won't trigger if my_sock_read is somewhere on the
call stack, only when the current point of execution enters the
function. You can step through execution from there to see where it's
looping.

If you find that my_sock_read isn't being called repeatedly, then the 
infinite loop is in my_sock_read or something it's calling. Break into 
execution and step through to see what Pg is doing.
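
To make the difference between those two cases concrete, here is a
minimal sketch in C. It is not PostgreSQL code: fake_sock_read,
caller_retries, callee_spins and the two counters are hypothetical
names made up for illustration. The entry_hits counter stands in for
"how many times a breakpoint at function entry would fire".

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are
 * PostgreSQL functions. */

/* Counts entries into fake_sock_read, i.e. how often a breakpoint
 * placed at the start of the function would be hit. */
unsigned long entry_hits = 0;

/* Counts iterations of the loop inside fake_sock_read. */
unsigned long inner_spins = 0;

/* Pretend socket read that never gets any data. If spin_inside is
 * nonzero it loops internally up to max_spins times before returning. */
int fake_sock_read(int spin_inside, unsigned long max_spins)
{
    entry_hits++;   /* a function-entry breakpoint fires here, once per call */

    while (spin_inside && inner_spins < max_spins)
        inner_spins++;   /* busy loop inside the function: no new entry */

    return 0;       /* 0 bytes read */
}

/* Case 1: the caller keeps retrying the read. The breakpoint is hit
 * on every retry. Returns the number of hits. */
unsigned long caller_retries(unsigned long attempts)
{
    entry_hits = 0;
    inner_spins = 0;
    for (unsigned long i = 0; i < attempts; i++)
        (void) fake_sock_read(0, 0);
    return entry_hits;
}

/* Case 2: one call that spins inside the function. The breakpoint is
 * hit only once, no matter how long the loop runs. Returns the number
 * of hits. */
unsigned long callee_spins(unsigned long spins)
{
    entry_hits = 0;
    inner_spins = 0;
    (void) fake_sock_read(1, spins);
    return entry_hits;
}
```

Calling caller_retries(n) gives n entry hits, while callee_spins(n)
gives a single hit however large n is. That is the same signal you'd
be reading off the breakpoint at my_sock_read: repeated hits point at
the caller looping, and silence after attaching points at a loop
inside the function or something it calls.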

> Given the reliability of the server in the past, I'd probably be expecting
> an issue with OpenSSL instead, but with debugging attached I should be able
> to say for sure.

Yep. If, for example, you waited until a backend was in the problem 
state where it was using 100% CPU, attached the debugger, and set a 
breakpoint at the start of my_sock_read in postgres.exe then you could 
see if my_sock_read(...) was being called repeatedly or just once.

--
Craig Ringer
