On (21/07/16 14:41), Sergey Senozhatsky wrote:
> @@ -657,6 +657,13 @@ static void check_cpu_stall(struct rcu_data *rdp)
>  	unsigned long js;
>  	struct rcu_node *rnp;
>  
> +	/*
> +	 * If a virtual machine is stopped by the host it can look to
> +	 * the watchdog like an RCU stall. Check to see if the host
> +	 * stopped the vm.
> +	 */
> +	kvm_check_and_clear_guest_paused();
> +
>  	lockdep_assert_irqs_disabled();
>  	if ((rcu_stall_is_suppressed() && !READ_ONCE(rcu_kick_kthreads)) ||
>  	    !rcu_gp_in_progress())
> @@ -699,14 +706,6 @@ static void check_cpu_stall(struct rcu_data *rdp)
>  	    (READ_ONCE(rnp->qsmask) & rdp->grpmask) &&
>  	    cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
>  
> -		/*
> -		 * If a virtual machine is stopped by the host it can look to
> -		 * the watchdog like an RCU stall. Check to see if the host
> -		 * stopped the vm.
> -		 */
> -		if (kvm_check_and_clear_guest_paused())
> -			return;
> -
>  		/* We haven't checked in, so go dump stack. */
>  		print_cpu_stall(gps);
>  		if (READ_ONCE(rcu_cpu_stall_ftrace_dump))
> @@ -717,14 +716,6 @@ static void check_cpu_stall(struct rcu_data *rdp)
>  		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) &&
>  		   cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
>  
> -		/*
> -		 * If a virtual machine is stopped by the host it can look to
> -		 * the watchdog like an RCU stall. Check to see if the host
> -		 * stopped the vm.
> -		 */
> -		if (kvm_check_and_clear_guest_paused())
> -			return;
> -
>  		/* They had a few time units to dump stack, so complain. */
>  		print_other_cpu_stall(gs2, gps);
>  		if (READ_ONCE(rcu_cpu_stall_ftrace_dump))

This patch depends on
https://lore.kernel.org/lkml/20210716053405.1243239-1-senozhatsky@xxxxxxxxxxxx/

If that x86/kvm patch lands, then we need to handle PVCLOCK_GUEST_STOPPED
in watchdogs.

In theory, this patch opens a small race window if the VCPU gets
preempted after kvm_check_and_clear_guest_paused() (external interrupt,
etc.). But it's hard to tell how likely the problem is.
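
To make the window a bit more concrete, here is a toy userspace model of
the two orderings. It is only a sketch: every toy_* name is hypothetical
and nothing here is kernel code. It assumes that a host pause sets a
"guest stopped" flag and makes guest time jump forward, and it injects
the pause right after the early check, which is the case described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_guest {
	bool paused_flag;	/* stands in for PVCLOCK_GUEST_STOPPED */
	unsigned long jiffies;	/* guest's view of time */
	unsigned long stall_at;	/* deadline after which a stall is reported */
};

/* Mimics check-and-clear semantics: report the flag, then reset it. */
static bool toy_check_and_clear_paused(struct toy_guest *g)
{
	bool was_paused = g->paused_flag;

	g->paused_flag = false;
	return was_paused;
}

/* Host stops the VM: guest time jumps forward and the flag is set. */
static void toy_host_pause(struct toy_guest *g, unsigned long ticks)
{
	g->jiffies += ticks;
	g->paused_flag = true;
}

enum toy_order { TOY_CHECK_EARLY, TOY_CHECK_LATE };

/*
 * Returns true if a stall warning would be printed. The host pause is
 * injected after the early check and before the deadline comparison.
 */
static bool toy_check_cpu_stall(enum toy_order order)
{
	struct toy_guest g = {
		.paused_flag = false, .jiffies = 100, .stall_at = 200,
	};

	assert(order == TOY_CHECK_EARLY || order == TOY_CHECK_LATE);

	/* Early check: the flag is not set yet, so this does nothing. */
	if (order == TOY_CHECK_EARLY)
		toy_check_and_clear_paused(&g);

	/* VCPU is preempted and the VM is stopped here. */
	toy_host_pause(&g, 1000);

	if (g.jiffies < g.stall_at)
		return false;	/* deadline not reached, no stall */

	/* Late check: the pause is noticed and the warning is suppressed. */
	if (order == TOY_CHECK_LATE && toy_check_and_clear_paused(&g))
		return false;

	return true;		/* spurious stall warning */
}
```

In this model toy_check_cpu_stall(TOY_CHECK_EARLY) reports a stall while
toy_check_cpu_stall(TOY_CHECK_LATE) does not. It says nothing about how
often the real window is hit.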