Re: [BUG] dw_wdt watchdog on linux-rt 4.18.5-rt4 not triggering

On Tue, Sep 18, 2018 at 03:21:08PM +0200, Steffen Trumtrar wrote:
> 
> Hi all!
> 
> I'm seeing an issue with the dw_wdt watchdog on the SoCFPGA ARM platform
> with the latest linux-rt v4.18.5-rt4. Actually I seem to have the same
> problem, that these patches try to fix:
> 
>  38a1222ae4f364d5bd5221fe305dbb0889f45d15
>  Author:     Christophe Leroy <christophe.leroy@xxxxxx>
>  AuthorDate: Fri Dec 8 11:18:35 2017 +0100
>  Commit:     Wim Van Sebroeck <wim@xxxxxxxxx>
>  CommitDate: Thu Dec 28 20:45:57 2017 +0100
> 
>  Follows:    v4.15-rc3 (345)
>  Precedes:   v4.16-rc1 (13997)
> 
>  watchdog: core: make sure the watchdog worker always works
> 
>  When running a command like 'chrt -f 50 dd if=/dev/zero of=/dev/null',
>  the watchdog_worker fails to service the HW watchdog and the
>  HW watchdog fires long before the watchdog soft timeout.
> 
>  At the moment, the watchdog_worker is invoked as a delayed work.
>  Delayed works are handled by non realtime kernel threads. The
>  WQ_HIGHPRI flag only increases the niceness of that threads.
> 
>  This patch replaces the delayed work logic by kthread delayed work,
>  and sets the associated kernel task to SCHED_FIFO with the highest
>  priority, in order to ensure that the watchdog worker will run as
>  soon as possible.
> 
> 
>  1ff688209e2ed23f699269b9733993e2ce123fd2
>  Author:     Christophe Leroy <christophe.leroy@xxxxxx>
>  AuthorDate: Thu Jan 18 12:11:21 2018 +0100
>  Commit:     Wim Van Sebroeck <wim@xxxxxxxxx>
>  CommitDate: Sun Jan 21 12:44:59 2018 +0100
> 
>  Follows:    v4.15-rc3 (349)
>  Precedes:   v4.16-rc1 (13993)
> 
>  watchdog: core: make sure the watchdog_worker is not deferred
> 
>  commit 4cd13c21b207e ("softirq: Let ksoftirqd do its job") has the
>  effect of deferring timer handling in case of high CPU load, hence
>  delaying the delayed work allthought the worker is running which
>  high realtime priority.
> 
>  As hrtimers are not managed by softirqs, this patch replaces the
>  delayed work by a plain work and uses an hrtimer to schedule that work.

The two commits above try very hard to ensure timely wakeup and
execution of the watchdogd thread: first by moving to kthread
delayed work, and then to vanilla kthread work + hrtimer.

This is sufficient on mainline, because hrtimer expiry functions are
unconditionally executed in hardirq context.  With PREEMPT_RT_FULL,
however, the hrtimer expiry functions are executed in softirq context
unless explicitly opted out.

...meaning that w/ PREEMPT_RT_FULL the expiry (and therefore the
watchdogd wakeup) may be indefinitely starved if there are runnable RT
tasks of higher priority than the softirq callback thread (ktimersoftd @
SCHED_FIFO 1 by default).

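This is a priority inversion: watchdogd runs at the highest RT priority,
but its wakeup depends on a thread running at one of the lowest.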

One possible solution is to opt out of the hrtimer softirq deferral by
using HRTIMER_MODE_HARD.  However, the expiry function would then
need to be vetted for use in hardirq context w/ PREEMPT_RT_FULL.  From a
cursory glance at the kthread_worker locking, it is not hardirq safe. :-\
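
To make the starvation argument concrete, here is a toy userspace model
of a single CPU with strict fixed-priority scheduling.  It is only a
sketch, not kernel code: struct rt_task, pick_next() and
watchdogd_ticks() are hypothetical names, priority 99 for watchdogd is
an assumption standing in for "the highest priority" from the commit
message, and the timer is assumed to expire again one tick after
watchdogd re-arms it.  The model says nothing about whether the
kthread_worker locking is safe in hardirq context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a SCHED_FIFO task; not a kernel structure. */
struct rt_task {
    const char *name;
    int prio;       /* SCHED_FIFO priority, higher value wins */
    int runnable;   /* nonzero if the task wants the CPU */
    int ticks_run;  /* ticks of CPU time the task has received */
};

enum { KTIMERSOFTD, HOG, WATCHDOGD, NTASKS };

/* Strict priority pick: the highest-priority runnable task, or NULL. */
static struct rt_task *pick_next(struct rt_task *tasks, size_t n)
{
    struct rt_task *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!tasks[i].runnable)
            continue;
        if (best == NULL || tasks[i].prio > best->prio)
            best = &tasks[i];
    }
    return best;
}

/*
 * Run the model for `ticks` ticks and return how many of them watchdogd
 * received.  hog_runnable models 'chrt -f 50 dd ...'; hard_expiry models
 * taking the hrtimer expiry in hardirq context instead of deferring it
 * to ktimersoftd.
 */
static int watchdogd_ticks(int hog_runnable, int hard_expiry, int ticks)
{
    struct rt_task tasks[NTASKS] = {
        [KTIMERSOFTD] = { "ktimersoftd", 1, 0, 0 },   /* SCHED_FIFO 1 */
        [HOG]         = { "dd", 50, hog_runnable, 0 }, /* chrt -f 50 */
        [WATCHDOGD]   = { "watchdogd", 99, 0, 0 },     /* assumed prio */
    };
    int timer_pending = 1;  /* the watchdog hrtimer has expired */

    assert(ticks >= 0);

    for (int t = 0; t < ticks; t++) {
        if (timer_pending) {
            if (hard_expiry) {
                /* Hardirq context: runs regardless of task priorities. */
                tasks[WATCHDOGD].runnable = 1;
                timer_pending = 0;
            } else {
                /* Softirq deferral: ktimersoftd must win the CPU first. */
                tasks[KTIMERSOFTD].runnable = 1;
            }
        }

        struct rt_task *cur = pick_next(tasks, NTASKS);
        if (cur == NULL)
            continue;
        cur->ticks_run++;

        if (cur == &tasks[KTIMERSOFTD]) {
            /* Expiry callback finally ran: wake watchdogd. */
            tasks[WATCHDOGD].runnable = 1;
            timer_pending = 0;
            cur->runnable = 0;
        } else if (cur == &tasks[WATCHDOGD]) {
            /* Ping the hardware, re-arm the timer, go back to sleep. */
            cur->runnable = 0;
            timer_pending = 1;
        }
    }
    return tasks[WATCHDOGD].ticks_run;
}
```

With the hog runnable and softirq deferral, watchdogd_ticks(1, 0, 10)
returns 0: ktimersoftd never gets the CPU, so watchdogd is never woken.
With hardirq expiry, watchdogd_ticks(1, 1, 10) returns 10, because the
wakeup no longer depends on a SCHED_FIFO 1 thread being scheduled.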

   Julia



