On 19/08/2009 1:34 PM, Brendan Hill wrote:
> Hi Craig, thanks for the analysis. If I attach a debugger on the runaway
> child process, will this halt execution for all the other child processes
> (ie. freeze the server)? And, can I attach Visual Studio C++ 2008, or is
> there a recommended debugger for Windows debugging?
Visual C++ 2008's debugger should be fine, and it's certainly a lot
nicer to use than windbg.exe. LOTS nicer. I'm surprised you have VS
2008 on your production server, though. Or are you planning on using
remote debugging?
Anyway: if you attach to a given backend, execution of the other
backends won't freeze. If you promptly unpause execution of the backend
you attached to, everything will run normally. You might not want to
interrupt the backend's execution for too long at a time though, as my
understanding is that Pg does have tasks that require synchronization
across all backends, and leaving one in a state of paused execution for
too long might slow things down.
I did some quick testing before posting. First, I downloaded and
unpacked the 8.4.0 sources, since that's what I'm running on my
workstation. I then established two sessions to an otherwise idle 8.4
DB on WinXP and attached VS 2008 EE's debugger to one of them:
Tools -> Attach to Process, check "show processes from all users",
select the target postgres.exe by pid, attach.
It took a while for VS to load symbols for the first time, but the other
backend was responsive during that time. When VS finished loading
symbols it auto-resumed execution of the backend.
When I pause execution the other backend remains responsive. I can
still establish new connections too.
With execution running normally I added a breakpoint at pq_recvbuf:
Debug -> New Breakpoint -> Break at Function (CTRL-B),
"pq_recvbuf", line 1 char 1 language "C", OK
then issued a query to the backend I was debugging. It processed the
query and then execution stopped at the breakpoint. I was prompted to
locate the source file I'd broken in, and when I did so it showed an
execution marker at the appropriate point. From there I could step
execution through the sources, etc.
When I was done, I just detached from the process with Tools -> Detach
All, leaving it running as before.
In your position I'd start by waiting until you have an out-of-control
backend, attaching to it without pausing it, and setting a breakpoint at
my_sock_read. If the breakpoint is hit then something's called
my_sock_read again; it won't trigger if my_sock_read is somewhere on the
call stack, only when the current point of execution enters the
function. You can step through execution from there to see where it's
looping.
If you find that my_sock_read isn't being called repeatedly, then the
infinite loop is in my_sock_read or something it's calling. Break into
execution and step through to see what Pg is doing.
> Given the reliability of the server in the past, I'd probably be expecting
> an issue with OpenSSL instead, but with debugging attached I should be able
> to say for sure.
Yep. If, for example, you waited until a backend was in the problem
state where it was using 100% CPU, attached the debugger, and set a
breakpoint at the start of my_sock_read in postgres.exe then you could
see if my_sock_read(...) was being called repeatedly or just once.
--
Craig Ringer
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)