On 4/24/21 8:27 AM, Peter.Enderborg@xxxxxxxx wrote:
> On 4/24/21 4:41 PM, Guenter Roeck wrote:
>> On 4/24/21 3:25 AM, Peter Enderborg wrote:
>>> This is not a rebooting watchdog. Its function is to take other
>>> actions than a hard reboot. On many complex systems there is some
>>> kind of manager that monitors and takes action on slow systems.
>>> Android has its lowmemorykiller (lmkd), desktops have earlyoom.
>>> This watchdog can be used to help the monitor perform some basic
>>> action to keep the monitor running.
>>>
>>> It can also be used standalone. This adds a policy that is
>>> killing the process with the highest oom_score_adj and using
>>> oom functions to do it quickly. I think it is a good use case
>>> for the patch. Memory situations can be problematic for
>>> software that monitors the system, but other policies
>>> should also be possible. Like picking tasks from a memcg, or
>>> specific UIDs or whatever is low priority.
>>> ---
>> NACK. Besides this not following the new watchdog API, the task
>> of a watchdog is to reset the system on failure. Its task is most
>> definitely not to re-implement the oom killer in any way, shape,
>> or form.
>>
>> Guenter
>
> Do you have a better idea where the re-invented wheel might
> fit better if not in the watchdog API?
>

The watchdog subsystem does support pretimeouts and a variety of
configurable pretimeout notifiers. A pretimeout notifier which invokes
the oom killer might be something worth discussing, though it would
require an audience large enough to determine if it really makes sense
(instead of improving the existing oom killer itself).

A possible alternative might be to introduce watchdog pretimeout
callbacks; this has actually been proposed in another context, but
without an upstream user. The oom killer could then subscribe to
watchdog pretimeouts and perform the action suggested here if a
pretimeout is observed. Again, such an approach might be worth
discussing with a larger audience.

Thanks,
Guenter