On Thu, 2008-06-26 at 15:41 -0500, Jason L Tibbitts III wrote:
> >>>>> "JB" == Josh Boyer <jwboyer@xxxxxxxxx> writes:
>
> JB> That might have had a bigger effect.  I thought koji would only run
> JB> one build job per builder?  Or is it per CPU?
>
> I don't know what koji does, but in this case koji was unaware that
> the jobs were still running.  I guess they had been killed from the
> server but not cleaned up on the builders.

This happened a lot with plague too.  I think it's Just Hard in *NIX to
ensure that all descendants of a given task have been killed dead dead
dead.  Maybe they somehow get out of the parent's process group, or they
are just hung and don't respond to signals, or they are in D state
(uninterruptible sleep) when the signals get sent, whatever.  Running
craploads of scripts and programs as part of the build process that fork
and exec and do God-knows-what doesn't lend itself to being cleaned up
easily.

I think either cgroups (?) or putting each build in a clean VM which can
be torn down completely is probably the answer.  And out of those two, a
whole new VM would be pretty heavy to create/destroy so it's probably
out of the question.

Dan
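
To make the process-group point concrete, here is a minimal sketch of the usual "kill the whole group" approach.  It is not what koji or plague actually do; the helper name run_and_kill_group and its run_for/grace parameters are made up for illustration.  It also shows the limits: anything that calls setsid() or otherwise leaves the group is missed, and a process in D state will not go away until it returns from the kernel.

```python
import os
import signal
import subprocess
import time


def run_and_kill_group(argv, run_for=0.5, grace=2.0):
    """Start argv in its own session, then kill its whole process group.

    Hypothetical helper for illustration only. Returns the return code of
    the top-level child (negative signal number if it died from a signal).
    """
    # start_new_session=True makes the child call setsid(), so it becomes
    # the leader of a new process group whose id equals its pid.
    proc = subprocess.Popen(argv, start_new_session=True)
    pgid = proc.pid

    time.sleep(run_for)  # stand-in for "the build ran for a while"

    # Ask politely first: SIGTERM goes to the child and to every
    # descendant that is still a member of the group.
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # the group is already gone

    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # Escalate to SIGKILL. Descendants that left the group are not
        # reached, and D-state processes only die once they wake up.
        try:
            os.killpg(pgid, signal.SIGKILL)
        except ProcessLookupError:
            pass
        proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # The shell forks a background sleep; both are in the same group.
    print(run_and_kill_group(["sh", "-c", "sleep 30 & wait"]))
```

Something like cgroups avoids the "escaped the group" case because membership is tracked by the kernel and a task cannot remove itself just by calling setsid().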