On Wed, 2014-05-21 at 23:40 -0400, Dave Jones wrote:
> On Thu, May 22, 2014 at 12:40:36PM +1000, Michael Ellerman wrote:
>
> > Sorry I didn't get back to you on this. I've been chasing a bug that trinity
> > found for us.
> >
> > Running aae6d6a I've seen this once, but only once:
> >
> > [watchdog] Sanity check failed! Found pid 1885550132!
> > [watchdog] problem checking on pid 112 (1:Operation not permitted)
> > [watchdog] pid 1885550132 has disappeared (oom-killed maybe?). Reaping.
> > [watchdog] pid 678326126 has disappeared (oom-killed maybe?). Reaping.
> > [watchdog] pid 1697185792 has disappeared (oom-killed maybe?). Reaping.
> > [watchdog] Reaped 3 dead children
> > Killed
>
> If it happens again, check /proc/sys/kernel/pid_max.
> I wonder if something scribbled in there.

It hasn't happened again, but I haven't rebooted since it did, and I still have:

$ cat /proc/sys/kernel/pid_max
65536

> Though looking at the pids in the dump above, I wonder if there's
> something more screwed up, like we corrupted the ptrs to the pid map
> in the shm.

Yeah it looks more like that to me.

cheers
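
For illustration only, here is a minimal sketch of the kind of range check discussed above: compare each pid from the watchdog dump against pid_max. This is not trinity's actual watchdog code; the helper names `read_pid_max` and `pid_is_sane`, and the fallback default of 32768, are assumptions made up for this example.

```python
from pathlib import Path

# Linux-only procfs entry holding the pid allocation wrap value.
PID_MAX_PATH = Path("/proc/sys/kernel/pid_max")


def read_pid_max(path=PID_MAX_PATH, default=32768):
    """Return pid_max read from `path`, or `default` if it cannot be read.

    The default is an assumption for this sketch, used only as a fallback.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return default


def pid_is_sane(pid, pid_max):
    """A pid is plausible only if it lies in the range 1 .. pid_max - 1."""
    return 0 < pid < pid_max


# Pids taken from the watchdog output quoted in this thread.
reported = [1885550132, 112, 678326126, 1697185792]

# pid_max as reported by `cat /proc/sys/kernel/pid_max` in this thread.
pid_max = 65536

bogus = [pid for pid in reported if not pid_is_sane(pid, pid_max)]
print("out-of-range pids:", bogus)
```

With pid_max at 65536, three of the four reported pids fall far outside the valid range, which fits the suspicion that the stored pids themselves were corrupted rather than pid_max.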