On Thu, Mar 5, 2020 at 10:07 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
>
> > On Wed, Mar 04, 2020 at 01:39:41PM -0800, Xi Wang wrote:
> >> The main purpose of kernel watchdog is to test whether scheduler can
> >> still schedule tasks on a cpu. In order to reduce latency from
> >> periodically invoking watchdog reset in thread context, we can simply
> >> touch watchdog from pick_next_task in scheduler. Compared to actually
> >> resetting watchdog from cpu stop / migration threads, we lose coverage
> >> on: a migration thread actually get picked and we actually context
> >> switch to the migration thread. Both steps are heavily protected by
> >> kernel locks and unlikely to silently fail. Thus the change would
> >> provide the same level of protection with less overhead.
> >>
> >> The new way vs the old way to touch the watchdogs is configurable
> >> from:
> >>
> >>   /proc/sys/kernel/watchdog_touch_in_thread_interval
> >>
> >> The value means:
> >>   0: Always touch watchdog from pick_next_task
> >>   1: Always touch watchdog from migration thread
> >>   N (N>0): Touch watchdog from migration thread once in every N
> >>            invocations, and touch watchdog from pick_next_task for
> >>            other invocations.
> >>
> >
> > This is configurable madness. What are we really trying to do here?
>
> Create yet another knob which will be advertised in random web blogs to
> solve all problems of the world and some more. Like the one which got
> silently turned into a NOOP ~10 years ago :)
>

The knob can obviously be removed, it's vestigial and reflects caution
from when we were implementing / rolling things over to it. We have
default values that we know work at scale. I don't think this actually
needs or wants to be tunable beyond on or off (and even that could be
strictly compile or boot time only).