In message <20080325182909.GD21732@xxxxxxxxxxxxxxxxxxxxxxxx>, Jan Kara writes: > > --sdtB3X0nJg68CQEu > Content-Type: text/plain; charset=us-ascii > Content-Disposition: inline > > > I was building a kernel using "make -j 4" inside a unionfs, mounted on top > > of ext3. The kernel is vanilla 2.6.25-rc6 plus unionfs patches. At some > > point I forced a cache flush using "echo 3 > /proc/sys/vm/drop_caches" and I > > got the lockdep warning below. Note that unionfs doesn't appear to be > > involved in this lockdep warning at all, so I suspect this is probably an > > issue between jbd and ext3 directly. > > > > Let me know if I can be of more help. > Actually, I have a fix for that - attached. Can you try it? Thanks. > Honza > > > [31270.749254] ======================================================= > > [31270.749473] [ INFO: possible circular locking dependency detected ] > > [31270.749720] 2.6.25-rc6-unionfs2 #69 > > [31270.749831] ------------------------------------------------------- > > [31270.749958] flush-caches.sh/9702 is trying to acquire lock: > > [31270.750171] (&journal->j_list_lock){--..}, at: [<c029ef6a>] journal_try_to_free_buffers+0xa5/0x158 > > [31270.751607] > > [31270.751607] but task is already holding lock: > > [31270.751607] (inode_lock){--..}, at: [<c02718bb>] drop_pagecache+0x45/0xd3 > > [31270.751607] > > [31270.751607] which lock already depends on the new lock. > > [31270.751607] > > [31270.751607] > > [31270.751607] the existing dependency chain (in reverse order) is: > > [31270.751607] > > [31270.751607] -> #1 (inode_lock){--..}: > > [31270.751607] [<c0231e01>] __lock_acquire+0x934/0xab8 > > [31270.751607] [<c0232307>] lock_acquire+0x4e/0x6a > > [31270.751607] [<c0395e33>] _spin_lock+0x23/0x32 > > [31270.751607] [<c02713d2>] __mark_inode_dirty+0xbe/0x134 > > [31270.751607] [<c0274b3a>] __set_page_dirty+0xdd/0xec > > [31270.751607] [<c0274ba9>] mark_buffer_dirty+0x60/0x63 > > [31270.751607] [<c029d2f6>] __journal_temp_unlink_buffer+0xc9/0xce > > [31270.751607] [<c029d43b>] __journal_unfile_buffer+0xb/0x15 > > [31270.751607] [<c029d487>] __journal_refile_buffer+0x42/0x8c > > [31270.751607] [<c029fdd1>] journal_commit_transaction+0xc59/0xe64 > > [31270.751607] [<c02a29f1>] kjournald+0xae/0x1b0 > > [31270.751607] [<c0228833>] kthread+0x3b/0x64 > > [31270.751607] [<c02039d3>] kernel_thread_helper+0x7/0x10 > > [31270.751607] [<ffffffff>] 0xffffffff > > [31270.751607] > > [31270.751607] -> #0 (&journal->j_list_lock){--..}: > > [31270.751607] [<c0231d28>] __lock_acquire+0x85b/0xab8 > > [31270.751607] [<c0232307>] lock_acquire+0x4e/0x6a > > [31270.751607] [<c0395e33>] _spin_lock+0x23/0x32 > > [31270.751607] [<c029ef6a>] journal_try_to_free_buffers+0xa5/0x158 > > [31270.751607] [<c02925fe>] ext3_releasepage+0x53/0x5c > > [31270.751607] [<c023e4bd>] try_to_release_page+0x32/0x3f > > [31270.751607] [<c0243f59>] __invalidate_mapping_pages+0x69/0xc6 > > [31270.751607] [<c02718de>] drop_pagecache+0x68/0xd3 > > [31270.751607] [<c0271972>] drop_caches_sysctl_handler+0x29/0x3f > > [31270.751607] [<c0289383>] proc_sys_write+0x61/0x7e > > [31270.751607] [<c02594a1>] vfs_write+0x8c/0x108 > > [31270.751607] [<c0259a2b>] sys_write+0x3b/0x60 > > [31270.751607] [<c0202cee>] sysenter_past_esp+0x5f/0xa5 > > [31270.751607] [<ffffffff>] 0xffffffff > > [31270.751607] > > [31270.751607] other info that might help us debug this: > > [31270.751607] > > [31270.751607] 2 locks held by flush-caches.sh/9702: > > [31270.751607] #0: (&type->s_umount_key#12){----}, at: [<c02718ab>] drop_pagecache+0x35/0xd3 > > [31270.751607] #1: (inode_lock){--..}, at: [<c02718bb>] drop_pagecache+0x45/0xd3 > > [31270.751608] > > [31270.751608] stack backtrace: > > [31270.751608] Pid: 9702, comm: flush-caches.sh Not tainted 2.6.25-rc6-unionfs2 #69 > > [31270.751608] [<c0230687>] print_circular_bug_tail+0x5b/0x66 > > [31270.751608] [<c0231d28>] __lock_acquire+0x85b/0xab8 > > [31270.751608] [<c0232307>] lock_acquire+0x4e/0x6a > > [31270.751608] [<c029ef6a>] ? journal_try_to_free_buffers+0xa5/0x158 > > [31270.751608] [<c0395e33>] _spin_lock+0x23/0x32 > > [31270.751608] [<c029ef6a>] ? journal_try_to_free_buffers+0xa5/0x158 > > [31270.751608] [<c029ef6a>] journal_try_to_free_buffers+0xa5/0x158 > > [31270.751608] [<c02925fe>] ext3_releasepage+0x53/0x5c > > [31270.751608] [<c023e4bd>] try_to_release_page+0x32/0x3f > > [31270.751608] [<c0243f59>] __invalidate_mapping_pages+0x69/0xc6 > > [31270.751608] [<c02718de>] drop_pagecache+0x68/0xd3 > > [31270.751608] [<c0271972>] drop_caches_sysctl_handler+0x29/0x3f > > [31270.751608] [<c0289383>] proc_sys_write+0x61/0x7e > > [31270.751608] [<c0289322>] ? proc_sys_write+0x0/0x7e > > [31270.751608] [<c02594a1>] vfs_write+0x8c/0x108 > > [31270.751608] [<c0259a2b>] sys_write+0x3b/0x60 > > [31270.751608] [<c0202cee>] sysenter_past_esp+0x5f/0xa5 > > [31270.751608] ======================= > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > > the body of a message to majordomo@xxxxxxxxxxxxxxx > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Please read the FAQ at http://www.tux.org/lkml/ > -- > Jan Kara <jack@xxxxxxx> > SuSE CR Labs > > --sdtB3X0nJg68CQEu > Content-Type: text/x-diff; charset=us-ascii > Content-Disposition: attachment; filename="vfs-2.6.25-drop_pagecache_sb_fix.diff" > > From f5e41087e345fa5c3b46ac36e6e4a654d2f7f624 Mon Sep 17 00:00:00 2001 > From: Jan Kara <jack@xxxxxxx> > Date: Tue, 18 Mar 2008 14:38:06 +0100 > Subject: [PATCH] Fix drop_pagecache_sb() to not call __invalidate_mapping_pages() under > inode_lock. > > Signed-off-by: Jan Kara <jack@xxxxxxx> > --- > fs/drop_caches.c | 8 +++++++- > 1 files changed, 7 insertions(+), 1 deletions(-) > > diff --git a/fs/drop_caches.c b/fs/drop_caches.c > index 59375ef..f5aae26 100644 > --- a/fs/drop_caches.c > +++ b/fs/drop_caches.c > @@ -14,15 +14,21 @@ int sysctl_drop_caches; > > static void drop_pagecache_sb(struct super_block *sb) > { > - struct inode *inode; > + struct inode *inode, *toput_inode = NULL; > > spin_lock(&inode_lock); > list_for_each_entry(inode, &sb->s_inodes, i_sb_list) { > if (inode->i_state & (I_FREEING|I_WILL_FREE)) > continue; > + __iget(inode); > + spin_unlock(&inode_lock); > __invalidate_mapping_pages(inode->i_mapping, 0, -1, true); > + iput(toput_inode); > + toput_inode = inode; > + spin_lock(&inode_lock); > } > spin_unlock(&inode_lock); > + iput(toput_inode); > } Jan, I'll be happy to test this, but I don't understand two things about this patch: 1. Is it safe to unlock and re-lock inode_lock temporarily within the loop? 2. What's the motivation behind having the second toput_inode pointer? It appears that the first iteration through the loop, toput_inode will be NULL, so we'll be iput'ing a NULL pointer (which is ok). So you're trying to iput the previous inode pointer that the list iterated over, right? Is that intended? Peter's post: http://lkml.org/lkml/2008/3/23/202 includes a reference to a mail by Andrew which implies that the fix may be much more involved than what you outlined above, no? Thanks, Erez. -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html