> Mike Burger wrote:
>> If you have a process that is stuck in a zombie mode and kill -9 isn't
>> getting rid of it, you may need to do something with the parent process
>> that spawned it in the first place.
>
> Yeah, but too often, the parent process has gone, and the zombie's now got
> a parent of 1.

That would stink, yeah. :-(

[Jack Allen]
I thought I would add a few comments about this type of problem.

If a process will not exit after a "kill -9 PID" has been done, then it is stuck waiting on the kernel to complete something on its behalf. When you "send a signal" to a PID, you are not really sending anything; you are only setting a bit in the process that indicates a signal has been posted for it. When the kernel next schedules the process to run, those bits are looked at and handled the way the process has set things up, that is, by its signal catchers (handlers). But -9 cannot be caught and processed by the process. The kernel will cause the process to exit.

Now, how can a process get stuck waiting on the kernel? Here is an example that used to happen quite often when 9trk tape drives were in use. Many of you may have never seen one. Anyway, say some type of backup was being written to a 2400-foot tape on a 9trk drive. When the backup completed, it might display a message to that effect and then close the file descriptor associated with the tape drive, causing the drive to rewind the tape. It takes maybe 20 to 30 seconds or more to rewind the tape, and during that time the operator would push the online button to take the drive offline and then push the unload button. The process is waiting for the kernel to let it know that the tape has rewound, is back at load point, and is considered closed. This will never happen, because the tape drive is now offline and will not generate an interrupt when the tape completes the rewind and reaches load point. Therefore the operator does not get their prompt back, or whatever should have happened next does not happen. You can do "kill -9 PID" on the process, but it is not going to terminate.

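As an aside, the signal mechanics described above (a signal is only marked as posted until the process can act on it, and -9 cannot be caught) can be observed from user space. This is only a rough sketch, assuming Python 3 on Linux and a single-threaded process; the function names are made up for illustration and SIGUSR1 is an arbitrary choice of signal.

```python
import os
import signal


def sigkill_is_catchable() -> bool:
    """Try to install a handler for SIGKILL; the kernel refuses the request."""
    try:
        signal.signal(signal.SIGKILL, lambda signum, frame: None)
    except (OSError, ValueError):
        return False
    return True


def pending_while_blocked(signum: int = signal.SIGUSR1) -> bool:
    """Block signum, send it to ourselves, and report whether it is
    recorded as pending instead of being acted on."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        # Same thing "kill -USR1 PID" does, aimed at this process.
        os.kill(os.getpid(), signum)
        # The signal has been posted but not handled yet.
        is_pending = signum in signal.sigpending()
        # Consume it so it is not delivered when the mask is restored.
        signal.sigwait({signum})
        return is_pending
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    print("SIGKILL catchable:", sigkill_is_catchable())
    print("SIGUSR1 pending while blocked:", pending_while_blocked())
```

Blocking the signal here only stands in for "the process has not had a chance to look at its signal bits yet"; a process stuck inside the kernel is in that position without having asked for it.
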
All the operator had to do was thread the tape and put the drive online again. The kernel would then receive an interrupt from the device, determine that a process was waiting to be woken up, and wake it. If a "kill -9 PID" had been done in the meantime, the process would terminate at that point; if not, it might display something else for the operator to do, like mount another tape.

Now about a Zombie process. A Zombie is a process that has exited, whether because it called exit() or because it received some signal that caused it to end. It is in the Zombie state because its parent has not done a wait() to pick up its exit status. If the parent has exited, then the Zombie is inherited by PID 1 (init). This is by design. When this happens, PID 1 is woken up and does a wait(), which returns the PID and the exit status. It determines that this was not a PID it started and just ignores it. But because it did the wait(), the PID is removed from the process table.

So if you do "kill -9 PID" and the process does not become a Zombie, then it is stuck waiting on the kernel. It will do no good to kill the parent. If it does become a Zombie and the parent does not do a wait(), then the parent has a bug, or it is waiting on something and just has not gotten around to doing a wait().

-----
Jack Allen
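
For anyone who wants to watch the Zombie life cycle described above, here is a minimal sketch, assuming Python 3 on Linux. The helper names and the exit code 7 are arbitrary choices for illustration. The parent forks a child that exits at once, leaves it unreaped, sends it -9 (which changes nothing, since the child has already exited), and only then does the wait() that removes the entry from the process table.

```python
import os
import signal


def pid_in_process_table(pid: int) -> bool:
    """Signal 0 delivers nothing; it only checks that the PID still exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    return True


def zombie_demo(exit_code: int = 7):
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately with a recognizable status.
        os._exit(exit_code)

    # Parent: block until the child has exited, but WNOWAIT leaves it
    # unreaped, so it stays in the process table as a Zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    # kill -9 on a Zombie is accepted but has no effect.
    os.kill(pid, signal.SIGKILL)
    listed_as_zombie = pid_in_process_table(pid)

    # The real wait(): collects the exit status and frees the entry.
    _, status = os.waitpid(pid, 0)
    reported_code = os.WEXITSTATUS(status)
    listed_after_wait = pid_in_process_table(pid)

    return listed_as_zombie, reported_code, listed_after_wait


if __name__ == "__main__":
    print(zombie_demo())
```

On a Linux box this should print (True, 7, False): the Zombie survives the kill -9 with its original exit status intact, and it disappears only once the parent waits for it. While the child is still unreaped, "ps" would show it in state Z.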