Re: [RFC NEXT] mm: Fix suspicious RCU usage at kernel/sched/core.c:7318

On Tue, Jun 16, 2015 at 09:44:24AM +0100, Catalin Marinas wrote:
> On Mon, Jun 15, 2015 at 10:25:18PM +0100, Larry Finger wrote:
> > Beginning at commit d52d399, the following INFO splat is logged:
> > 
> > [    2.816564] ===============================
> > [    2.816986] [ INFO: suspicious RCU usage. ]
> > [    2.817402] 4.1.0-rc7-next-20150612 #1 Not tainted
> > [    2.817881] -------------------------------
> > [    2.818297] kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section!
> > [    2.819180]
> > other info that might help us debug this:
> > 
> > [    2.819947]
> > rcu_scheduler_active = 1, debug_locks = 0
> > [    2.820578] 3 locks held by systemd/1:
> > [    2.820954]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff815f0c8f>] rtnetlink_rcv+0x1f/0x40
> > [    2.821855]  #1:  (rcu_read_lock_bh){......}, at: [<ffffffff816a34e2>] ipv6_add_addr+0x62/0x540
> > [    2.822808]  #2:  (addrconf_hash_lock){+...+.}, at: [<ffffffff816a3604>] ipv6_add_addr+0x184/0x540
> > [    2.823790]
> > stack backtrace:
> > [    2.824212] CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1
> > [    2.824932] Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20   04/17/2014
> > [    2.825751]  0000000000000001 ffff880224e07838 ffffffff817263a4 ffffffff810ccf2a
> > [    2.826560]  ffff880224e08000 ffff880224e07868 ffffffff810b6827 0000000000000000
> > [    2.827368]  ffffffff81a445d3 00000000000004f4 ffff88022682e100 ffff880224e07898
> > [    2.828177] Call Trace:
> > [    2.828422]  [<ffffffff817263a4>] dump_stack+0x4c/0x6e
> > [    2.828937]  [<ffffffff810ccf2a>] ? console_unlock+0x1ca/0x510
> > [    2.829514]  [<ffffffff810b6827>] lockdep_rcu_suspicious+0xe7/0x120
> > [    2.830139]  [<ffffffff8108cf05>] ___might_sleep+0x1d5/0x1f0
> > [    2.830699]  [<ffffffff8108cf6d>] __might_sleep+0x4d/0x90
> > [    2.831239]  [<ffffffff811f3789>] ? create_object+0x39/0x2e0
> > [    2.831800]  [<ffffffff811da427>] kmem_cache_alloc+0x47/0x250
> > [    2.832375]  [<ffffffff813c19ae>] ? find_next_zero_bit+0x1e/0x20
> > [    2.832973]  [<ffffffff811f3789>] create_object+0x39/0x2e0
> > [    2.833515]  [<ffffffff810b7eb6>] ? mark_held_locks+0x66/0x90
> > [    2.834089]  [<ffffffff8172efab>] ? _raw_spin_unlock_irqrestore+0x4b/0x60
> > [    2.834761]  [<ffffffff817193c1>] kmemleak_alloc_percpu+0x61/0xe0
> > [    2.835369]  [<ffffffff811a26f0>] pcpu_alloc+0x370/0x630
> > 
> > Additional backtrace lines are truncated. The splat above is also
> > followed by several "BUG: sleeping function called from invalid context
> > at mm/slub.c:1268" messages. As suggested by Martin KaFai Lau, these are
> > the clue to the fix: kmemleak_alloc_percpu() always uses GFP_KERNEL
> > for its allocations, whereas it should use the gfp flags passed to
> > pcpu_alloc().
> > 
> > Signed-off-by: Larry Finger <Larry.Finger@xxxxxxxxxxxx>
> > Cc: Martin KaFai Lau <kafai@xxxxxx>
> > Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> > To: Tejun Heo <tj@xxxxxxxxxx>
> > Cc: Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx>
> > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > Cc: linux-mm@xxxxxxxxx
[...]
> Apart from the minor comment above (and the kmemleak.c.rej file):
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx>

BTW, it's worth adding:

Cc: <stable@xxxxxxxxxxxxxxx> # v3.18+

(or Fixes: 5835d96e9ce4 ("percpu: implement [__]alloc_percpu_gfp()"))
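
For readers following the thread, here is a rough userspace model of the problem described in the quoted commit message. It is only a sketch, not the kernel code: every model_* and MODEL_* name is hypothetical, and the flag values are made up for illustration. It shows a tracking hook that ignores the caller's allocation flags and always asks for a sleeping allocation, next to one that forwards the caller's flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for gfp flags; the values are illustrative only. */
typedef unsigned int model_gfp_t;
#define MODEL_GFP_MAY_SLEEP 0x1u
#define MODEL_GFP_ATOMIC    0x0u                /* caller must not sleep */
#define MODEL_GFP_KERNEL    MODEL_GFP_MAY_SLEEP /* caller may sleep */

/* True while the modelled caller is in a context that must not sleep. */
bool model_in_atomic;

/* Counts allocations that were allowed to sleep in atomic context. */
int model_sleep_violations;

/* Models the metadata allocation made by the tracking code. */
void model_track_object(model_gfp_t gfp)
{
	if ((gfp & MODEL_GFP_MAY_SLEEP) && model_in_atomic)
		model_sleep_violations++;
}

/* Problem case: the hook drops the caller's flags and may always sleep. */
void model_hook_before(model_gfp_t caller_gfp)
{
	(void)caller_gfp;
	model_track_object(MODEL_GFP_KERNEL);
}

/* Suggested direction: the hook forwards the flags the caller passed in. */
void model_hook_after(model_gfp_t caller_gfp)
{
	model_track_object(caller_gfp);
}
```

With model_in_atomic set, calling model_hook_before(MODEL_GFP_ATOMIC) records a violation, while model_hook_after(MODEL_GFP_ATOMIC) does not.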

-- 
Catalin
