On Wed, Dec 14, 2011 at 2:11 PM, Linus Walleij <linus.walleij@xxxxxxxxxx> wrote:
> On Mon, Dec 5, 2011 at 8:55 PM, Vincent Li <vincent.mc.li@xxxxxxxxx> wrote:
>
>> We have a complex system with a large number of processes running
>> simultaneously. If any of the processes gets into a faulty state and
>> hangs, or consumes more than its fair share of the system resources,
>> the other processes may not get a chance to run, and the whole system
>> can hang, interrupting system functionality and user traffic.
>
> Have you tried using RLIMITs?
>
> Last time I used something like this from each process:
>
> #include <sys/time.h>
> #include <sys/resource.h>
>
> struct rlimit rl;
> int ret;
>
> // No process runs for more than 5 seconds of CPU time
> rl.rlim_cur = rl.rlim_max = 5;
> ret = setrlimit(RLIMIT_CPU, &rl);
> // No realtime process runs for more than 1 second (1000000 us)
> rl.rlim_cur = rl.rlim_max = 1000000;
> ret = setrlimit(RLIMIT_RTTIME, &rl);
>
> The latter is good if you have real-time processes.
>
> There are also RLIMITs for memory consumption.
>
> Consult:
> http://kernel.org/doc/man-pages/online/pages/man2/getrlimit.2.html

Thank you for the link, I will look into it.

Vincent
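
For reference, a minimal self-contained sketch of how a process could apply the limits described above, assuming Linux with glibc (RLIMIT_RTTIME needs _GNU_SOURCE and only affects tasks under a real-time scheduling policy); the error handling and the limit_self() helper are illustrative additions, not part of the original mail:

#define _GNU_SOURCE             /* needed for RLIMIT_RTTIME with glibc */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Apply the limits from the fragment above to the calling process. */
static void limit_self(void)
{
        struct rlimit rl;

        /* Terminate the process (SIGXCPU, then SIGKILL) after 5 s of CPU time. */
        rl.rlim_cur = rl.rlim_max = 5;
        if (setrlimit(RLIMIT_CPU, &rl) < 0) {
                perror("setrlimit(RLIMIT_CPU)");
                exit(EXIT_FAILURE);
        }

        /*
         * If the task is scheduled SCHED_FIFO/SCHED_RR, allow at most 1 s
         * (1000000 us) of CPU time without a blocking system call.
         */
        rl.rlim_cur = rl.rlim_max = 1000000;
        if (setrlimit(RLIMIT_RTTIME, &rl) < 0) {
                perror("setrlimit(RLIMIT_RTTIME)");
                exit(EXIT_FAILURE);
        }
}

int main(void)
{
        limit_self();

        /* Spin forever: the kernel should terminate us after ~5 s of CPU time. */
        for (;;)
                ;
}

Running this under time(1) should show the process being killed by SIGXCPU after roughly five seconds of CPU time, which is the behaviour the watchdog-like policy above relies on.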