On Tue, Nov 20, 2012 at 12:53 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote: > On Mon, Nov 19, 2012 at 07:50:06AM +0100, Torsten Kaiser wrote: > So, both lockdep thingy's are the same: I suspected this, but as the reports where slightly different I attached bith of them, as I couldn't decide witch one was the better/simpler report to debug this. >> [110926.972482] ========================================================= >> [110926.972484] [ INFO: possible irq lock inversion dependency detected ] >> [110926.972486] 3.7.0-rc4 #1 Not tainted >> [110926.972487] --------------------------------------------------------- >> [110926.972489] kswapd0/725 just changed the state of lock: >> [110926.972490] (sb_internal){.+.+.?}, at: [<ffffffff8122b268>] xfs_trans_alloc+0x28/0x50 >> [110926.972499] but this lock took another, RECLAIM_FS-unsafe lock in the past: >> [110926.972500] (&(&ip->i_lock)->mr_lock/1){+.+.+.} > > Ah, what? Since when has the ilock been reclaim unsafe? > >> [110926.972500] and interrupts could create inverse lock ordering between them. >> [110926.972500] >> [110926.972503] >> [110926.972503] other info that might help us debug this: >> [110926.972504] Possible interrupt unsafe locking scenario: >> [110926.972504] >> [110926.972505] CPU0 CPU1 >> [110926.972506] ---- ---- >> [110926.972507] lock(&(&ip->i_lock)->mr_lock/1); >> [110926.972509] local_irq_disable(); >> [110926.972509] lock(sb_internal); >> [110926.972511] lock(&(&ip->i_lock)->mr_lock/1); >> [110926.972512] <Interrupt> >> [110926.972513] lock(sb_internal); > > Um, that's just bizzare. No XFS code runs with interrupts disabled, > so I cannot see how this possible. > > ..... > > > [<ffffffff8108137e>] mark_held_locks+0x7e/0x130 > [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0 > [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0 > [<ffffffff810dba31>] vm_map_ram+0x271/0x770 > [<ffffffff811e1316>] _xfs_buf_map_pages+0x46/0xe0 > [<ffffffff811e222a>] xfs_buf_get_map+0x8a/0x130 > [<ffffffff81233ab9>] xfs_trans_get_buf_map+0xa9/0xd0 > [<ffffffff8121bced>] xfs_ialloc_inode_init+0xcd/0x1d0 > > We shouldn't be mapping buffers there, there's a patch below to fix > this. It's probably the source of this report, even though I cannot > lockdep seems to be off with the fairies... I also tried to understand what lockdep was saying, but Documentation/lockdep-design.txt is not too helpful. I think 'CLASS'-ON-R / -ON-W means that this lock was 'ON' / held while 'CLASS' (HARDIRQ, SOFTIRQ, RECLAIM_FS) happend and that makes this lock unsafe for these contexts. IN-'CLASS'-R / -W seems to be 'lock taken in context 'CLASS'. A note that 'CLASS'-ON-? means 'CLASS'-unsafe in there would be helpful to me... Wrt. above interrupt output: I think lockdep doesn't really know about RECLAIM_FS and threats it as another interrupt. I think that output should have been something like this: CPU0 CPU1 ---- ---- lock(&(&ip->i_lock)->mr_lock/1); <Allocation enters reclaim> lock(sb_internal); lock(&(&ip->i_lock)->mr_lock/1); <Allocation enters reclaim> lock(sb_internal); Entering reclaim on CPU1 would mean that CPU1 would not enter reclaim again, so the reclaim-'interrupt' would be disabled. And instead of interrupts disrupting the normal codeflow on CPU0 it would be 'interrupted' be instead of doing a normal allocation, it would 'interrupt' the allocation to reclaim memory. print_irq_lock_scenario() would need to be taught to print a slightly different message for reclaim-'interrupts'. I will try your patch, but as I do not have a reliable reproducer to create this lockdep report, I can't really verify if this fixes it. But I will definitely mail you, if it happens again with this patch. Thanks, Torsten > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx > > xfs: inode allocation should use unmapped buffers. > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > Inode buffers do not need to be mapped as inodes are read or written > directly from/to the pages underlying the buffer. This fixes a > regression introduced by commit 611c994 ("xfs: make XBF_MAPPED the > default behaviour"). > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > --- > fs/xfs/xfs_ialloc.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c > index 2d6495e..a815412 100644 > --- a/fs/xfs/xfs_ialloc.c > +++ b/fs/xfs/xfs_ialloc.c > @@ -200,7 +200,8 @@ xfs_ialloc_inode_init( > */ > d = XFS_AGB_TO_DADDR(mp, agno, agbno + (j * blks_per_cluster)); > fbuf = xfs_trans_get_buf(tp, mp->m_ddev_targp, d, > - mp->m_bsize * blks_per_cluster, 0); > + mp->m_bsize * blks_per_cluster, > + XBF_UNMAPPED); > if (!fbuf) > return ENOMEM; > /* _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs