Re: Deadlock with nilfs on 2.6.31.4

Hi Bruno,
On Thu, 22 Oct 2009 22:19:39 +0200, Bruno Prémont <bonbons@xxxxxxxxxxxxxxxxx> wrote:
> On Fri, 23 October 2009 Ryusuke Konishi <konishi.ryusuke@xxxxxxxxxxxxx> wrote:
> > Thank you for reporting the issue.
> > 
> > According to the log, the log-writer of nilfs appears to be idle even
> > though it has some requests waiting.
> > 
> > Could you try the following patch to narrow down the issue?
> > 
> > I'll dig into this issue next week since I'm now away from my office
> > to attend the Linux symposium in Tokyo.
> > 
> > Thank you,
> > Ryusuke Konishi
> > 
> > 
> > diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
> > index 51ff3d0..0932571 100644
> > --- a/fs/nilfs2/segment.c
> > +++ b/fs/nilfs2/segment.c
> 
> I tried the patch, below is full dmesg output from system start-up to
> frozen syslog-ng (and collectd thread). (with echo t > /proc/sysrq-trigger)
> 
> Hard to tell at what time syslog-ng froze, but it was most likely
> somewhere between 435.x and 591.x, when nilfs stops sending/getting events.
> 
> The collectd instance in D-state is most probably the one that wants to
> write data to RRD file.
> 
> At least it looks very easy to reproduce! Just restarting collectd a few
> times and enabling its rrdtool plugin. (syslog-ng writing to one nilfs
> partition, collectd to another one, both on the same SD card)
> 
> Bruno
> 
<snip>

I found the cause of the hang reported on ARM targets.
The following patch should fix it.

It resolved the hang on my Feroceon-based Linux box.

Could you check whether the patch fixes the hang on your system?

Thanks,
Ryusuke Konishi

--
From: Ryusuke Konishi <konishi.ryusuke@xxxxxxxxxxxxx>

nilfs2: fix dirty page accounting leak causing hang at write

Some users experienced a consistent hang while using NILFS on
ARM-based targets.

I found this was caused by an underflow of the dirty page counter.  A
b-tree cache routine was marking a page dirty without adjusting the
page accounting information.

This fixes the dirty page accounting leak and resolves the hang on
ARM-based targets.

Reported-by: Bruno Premont <bonbons@xxxxxxxxxxxxxxxxx>
Reported-by: Dunphy, Bill <WDunphy@xxxxxxxxxxxxxxxx>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@xxxxxxxxxxxxx>
---
 fs/nilfs2/btnode.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c
index 5941958..435864c 100644
--- a/fs/nilfs2/btnode.c
+++ b/fs/nilfs2/btnode.c
@@ -276,8 +276,7 @@ void nilfs_btnode_commit_change_key(struct address_space *btnc,
 				       "invalid oldkey %lld (newkey=%lld)",
 				       (unsigned long long)oldkey,
 				       (unsigned long long)newkey);
-		if (!test_set_buffer_dirty(obh) && TestSetPageDirty(opage))
-			BUG();
+		nilfs_btnode_mark_dirty(obh);
 
 		spin_lock_irq(&btnc->tree_lock);
 		radix_tree_delete(&btnc->page_tree, oldkey);
--
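
For readers following the commit message, here is a minimal user-space
sketch of how marking a page dirty without accounting can drive a dirty
page counter below zero. It is only a toy model, not kernel code: every
name in it (toy_page, nr_dirty, toy_set_page_dirty and so on) is
hypothetical, and it assumes the writeback path decrements the counter
whenever it clears a dirty flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one page; all names here are hypothetical. */
struct toy_page {
	bool dirty;
};

/* Stands in for a global dirty page statistic. */
static long nr_dirty;

/* Accounted path: bump the counter only on a clean -> dirty transition. */
static void toy_set_page_dirty(struct toy_page *p)
{
	if (!p->dirty) {
		p->dirty = true;
		nr_dirty++;
	}
}

/* Unaccounted path: sets the flag but never touches the counter. */
static bool toy_test_set_dirty_flag(struct toy_page *p)
{
	bool was_dirty = p->dirty;

	p->dirty = true;
	return was_dirty;
}

/* Writeback path: clearing a dirty page decrements the counter. */
static void toy_clear_page_dirty_for_io(struct toy_page *p)
{
	if (p->dirty) {
		p->dirty = false;
		nr_dirty--;
	}
}

/* Run one dirty/writeback cycle and return the resulting counter value. */
static long toy_cycle(bool accounted)
{
	struct toy_page page = { .dirty = false };

	nr_dirty = 0;
	if (accounted)
		toy_set_page_dirty(&page);
	else
		(void)toy_test_set_dirty_flag(&page);
	toy_clear_page_dirty_for_io(&page);
	assert(!page.dirty);
	return nr_dirty;
}
```

In this model toy_cycle(true) leaves the counter balanced, while
toy_cycle(false) leaves it one below where it started. That is the kind
of imbalance the patch avoids by going through nilfs_btnode_mark_dirty()
instead of setting the dirty bits directly.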