On Wed, Nov 30, 2016 at 11:21:59AM -0800, Guenter Roeck wrote:
> On Wed, Nov 30, 2016 at 04:03:33AM -0800, Paul E. McKenney wrote:
> > On Wed, Nov 30, 2016 at 02:52:11AM -0800, Guenter Roeck wrote:
> > > On 11/29/2016 11:02 PM, Paul E. McKenney wrote:
> > > >On Tue, Nov 29, 2016 at 08:32:51PM -0800, Guenter Roeck wrote:
> > > >>On 11/29/2016 05:28 PM, Paul E. McKenney wrote:
> > > >>>On Tue, Nov 29, 2016 at 01:23:08PM -0800, Guenter Roeck wrote:
> > > >>>>Hi Paul,
> > > >>>>
> > > >>>>most of my qemu tests for sparc32 targets started to fail in next-20161129.
> > > >>>>The problem is only seen in SMP builds; non-SMP builds are fine.
> > > >>>>Bisect points to commit 2d66cccd73436 ("mm: Prevent __alloc_pages_nodemask()
> > > >>>>RCU CPU stall warnings"); reverting that commit fixes the problem.
> >
> > And I have dropped this patch.  Michal Hocko showed me the error of
> > my ways with this patch.
> >
> > :-)
>
> On another note, I still get RCU tracebacks in the s390 tests.
>
> BUG: sleeping function called from invalid context at mm/page_alloc.c:3775
>
> That is caused by 'rcu: Maintain special bits at bottom of ->dynticks counter';
> if I recall correctly we had discussed that earlier.

Indeed, I had missed a dyntick counter update back on Nov 11, which meant
that some of the code was still looking at the low-order bit instead of
the next bit up.  This is now fixed.

So to get to the error message you call out above, I need to have
improperly left the system in bh state or left irqs disabled, while the
system was running normally without an oops.  I am having a hard time
seeing how this patch can do that.  I would be more suspicious of
f2a471ffc8a8 ("rcu: Allow boot-time use of cond_resched_rcu_qs()").

So you bisected or did a revert to work out which was the offending commit?

                                                        Thanx, Paul