Martin Hewitt wrote:
> Hi Mark,
>
> I've exhausted the Java avenues for debugging this issue. Since my
> last email, the process I pointed strace at has been killed, but
> I'm afraid the rather raw format of the strace file is lost on me.
> The last six lines of the output file are:
>
> clone(child_stack=0x4202a250,

At a guess, looks like it's creating a child process.
<snip>
> futex(0x4202a9d0, FUTEX_WAIT, 23241, NULL) = -1 EINTR (Interrupted system call)
> --- SIGHUP (Hangup) @ 0 (0) ---
> futex(0x2ab0b620a000, FUTEX_WAKE_PRIVATE, 1) = 1
> rt_sigreturn(0x2ab0b620a000) = -1 EINTR (Interrupted system call)
> futex(0x4202a9d0, FUTEX_WAIT, 23241, NULL <unfinished ... exit status 129>
>
> The SIGHUP is new information, and appears to be what's causing the
> Java app to exit. Surely Java should be aware of the Interrupted
> system call?
>
> There are no other signals in the output file, and the only EINTRs are
> in the passage above.

Does the exit status of 129 say anything other than SIGHUP?

> Looks like I need to delve back into Java...

Yeah. I think you need to try what I was suggesting: start wrapping
function calls in try/catch, and work your way down. When you find the
one it fails in, go into that function, er, method, and wrap the calls
in there (and/or put a writeln in a few choice spots) until you find
the exact function the SIGHUP (or whatever) is happening in.

        mark "why, yes, I *was* a developer longer than I've been an admin"

> Martin
>
> On 10 February 2011 19:37, <m.roth@xxxxxxxxx> wrote:
>> Hey, Martin,
>>
>> Martin Hewitt wrote:
>>>
>>> Thanks, I didn't know about the strace command, so that's useful.
>>> Fortunately, this is on a dedicated server, so there's a fair amount
>>> of free disk.
>> <snip>
>> If you can do the code changes (and the try/catch is *supposed* to be in
>> there, according to Java style), work your way down, y'know...
>>
>> main
>>
>>     ...
>>     try {
>>         First actual call to do the job
>>     } catch
>>         writeln error;
>>
>> And if it fails there, then you know; otherwise, go to the next main
>> call, sorry, "invocation of a method"....
>>
>> Then again, this time in each of the main function calls under that, and
>> step down until you find the function it's dying in. That'll give you a
>> much better handle on what's happening.
>>
>>> Thanks for the help.
>>>
>> Good luck.
>>
>>        mark
>>> Martin
>>>
>>> On 10 February 2011 18:58, <m.roth@xxxxxxxxx> wrote:
>>>> Martin Hewitt wrote:
>>>>> Hi all,
>>>>>
>>>>> I'm running CentOS 5.5 Final, Java version "1.6.0_17" OpenJDK Runtime
>>>>> Environment (IcedTea6 1.7.5) (rhel-1.16.b17.el5-x86_64) OpenJDK
>>>>> 64-Bit Server VM (build 14.0-b16, mixed mode) installed via Yum.
>>>>>
>>>>> We have a Java application, packaged as a jar, running on our servers
>>>>> which, periodically, crawls RSS feeds and writes the articles to a
>>>>> database.
>>>>>
>>>>> Randomly, and seemingly without cause, these processes will die, not
>>>>> through the application exiting, or due to my killing it, but due to
>>>>> something that seems to kill it without leaving a trace.
>>>> <snip>
>>>> The hard (but correct) way would be to put try {} catch in the code,
>>>> and work your way down. Trying to debug it using a debugger might be
>>>> real problematical, if you can't repeatably provoke it. I *suppose*
>>>> you could attach strace to it, and dump the output into a file (on a
>>>> filesystem with a *lot* of disk space)....
>>>>
>>>>        mark
>>>>
>>>> _______________________________________________
>>>> CentOS mailing list
>>>> CentOS@xxxxxxxxxx
>>>> http://lists.centos.org/mailman/listinfo/centos
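
For anyone following the thread, here is a minimal Java sketch of the two ideas discussed above: wrapping each top-level call so the log names the step that failed, and decoding the exit status. On Linux, SIGHUP is signal 1, and an exit status of 129 conventionally reads as 128 + 1. A signal is not delivered to Java code as an exception, so try/catch alone will probably not observe a SIGHUP; a shutdown hook, which the JVM normally runs when it exits on SIGHUP, SIGINT or SIGTERM, is one way to log that the exit happened. The class name CrawlerDiagnostics, the method names and the "fetch feeds" step are all hypothetical and do not come from the application in the thread. The sketch sticks to anonymous classes so that it should also compile on the 1.6 JVM mentioned above.

```java
// Hypothetical helper: every name here is illustrative, not from the thread.
class CrawlerDiagnostics {

    // By convention, an exit status above 128 encodes "ended by signal
    // number (status - 128)", so 129 maps to 1, which is SIGHUP on Linux.
    // Returns 0 when the status does not look like a signal exit.
    static int signalFromExitStatus(int status) {
        return status > 128 ? status - 128 : 0;
    }

    // Runs one step; if it throws, logs which step failed and rethrows,
    // so the failure is narrowed down without being swallowed.
    static void traced(String step, Runnable body) {
        try {
            body.run();
        } catch (RuntimeException e) {
            System.err.println("step '" + step + "' failed: " + e);
            throw e;
        } catch (Error e) {
            System.err.println("step '" + step + "' failed: " + e);
            throw e;
        }
    }

    public static void main(String[] args) {
        // The hook runs on normal exit and, by default, when the JVM is
        // terminated by SIGHUP, SIGINT or SIGTERM. It cannot say who sent
        // the signal, only that the JVM is going down.
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                System.err.println("JVM shutting down (normal exit or signal)");
            }
        }));

        // Assumed placeholder for the first real call of the application.
        traced("fetch feeds", new Runnable() {
            public void run() {
                // first actual call to do the job goes here
            }
        });

        System.out.println("exit status 129 -> signal "
                + signalFromExitStatus(129));
    }
}
```

If the hook's message shows up in the log and none of the traced steps reported a failure, that would suggest the process is being stopped from outside rather than dying inside a method, which matches the SIGHUP in the strace output.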