Re: [PATCH] kfence: Avoid stalling work queue task without allocations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 11, 2020 at 09:21:53PM +0100, Marco Elver wrote:
> On Wed, Nov 11, 2020 at 11:21AM -0800, Paul E. McKenney wrote:
> [...]
> > > >     rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled
> > > 
> > > Sadly, no, next-20201110 already included that one, and that's what I
> > > tested and got me all those warnings above.
> > 
> > Hey, I had to ask!  The only uncertainty I seee is the acquisition of
> > the lock in rcu_iw_handler(), for which I add a lockdep check in the
> > (untested) patch below.  The other thing I could do is sprinkle such
> > checks through the stall-warning code on the assumption that something
> > RCU is calling is enabling interrupts.
> > 
> > Other thoughts?
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > index 70d48c5..3d67650 100644
> > --- a/kernel/rcu/tree_stall.h
> > +++ b/kernel/rcu/tree_stall.h
> > @@ -189,6 +189,7 @@ static void rcu_iw_handler(struct irq_work *iwp)
> >  
> >  	rdp = container_of(iwp, struct rcu_data, rcu_iw);
> >  	rnp = rdp->mynode;
> > +	lockdep_assert_irqs_disabled();
> >  	raw_spin_lock_rcu_node(rnp);
> >  	if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) {
> >  		rdp->rcu_iw_gp_seq = rnp->gp_seq;
> 
> This assert didn't fire yet, I just get more of the below. I'll keep
> rerunning, but am not too hopeful...

Is bisection a possibility?

Failing that, please see the updated patch below.  This adds a few more
calls to lockdep_assert_irqs_disabled(), but perhaps more helpfully dumps
the current stack of the CPU that the RCU grace-period kthread wants to
run on in the case where this kthread has been starved of CPU.

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 70d48c5..d203ea0 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -189,6 +189,7 @@ static void rcu_iw_handler(struct irq_work *iwp)
 
 	rdp = container_of(iwp, struct rcu_data, rcu_iw);
 	rnp = rdp->mynode;
+	lockdep_assert_irqs_disabled();
 	raw_spin_lock_rcu_node(rnp);
 	if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) {
 		rdp->rcu_iw_gp_seq = rnp->gp_seq;
@@ -449,21 +450,32 @@ static void print_cpu_stall_info(int cpu)
 /* Complain about starvation of grace-period kthread.  */
 static void rcu_check_gp_kthread_starvation(void)
 {
+	int cpu;
 	struct task_struct *gpk = rcu_state.gp_kthread;
 	unsigned long j;
 
 	if (rcu_is_gp_kthread_starving(&j)) {
+		cpu = gpk ? task_cpu(gpk) : -1;
 		pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
 		       rcu_state.name, j,
 		       (long)rcu_seq_current(&rcu_state.gp_seq),
 		       data_race(rcu_state.gp_flags),
 		       gp_state_getname(rcu_state.gp_state), rcu_state.gp_state,
-		       gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1);
+		       gpk ? gpk->state : ~0, cpu);
 		if (gpk) {
 			pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
 			pr_err("RCU grace-period kthread stack dump:\n");
+			lockdep_assert_irqs_disabled();
 			sched_show_task(gpk);
+			lockdep_assert_irqs_disabled();
+			if (cpu >= 0) {
+				pr_err("Stack dump where RCU grace-period kthread last ran:\n");
+				if (!trigger_single_cpu_backtrace(cpu))
+					dump_cpu_task(cpu);
+			}
+			lockdep_assert_irqs_disabled();
 			wake_up_process(gpk);
+			lockdep_assert_irqs_disabled();
 		}
 	}
 }




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux