Re: ext3 data=journal hangs

On Thu, 11 Jan 2007 21:34:12 -0800
Randy Dunlap <randy.dunlap@xxxxxxxxxx> wrote:

> (resending for wider audience)
> 
> Date: Wed, 10 Jan 2007 16:03:51 -0800
> To: linux-ext4@xxxxxxxxxxxxxxx
> 
> 
> On Tue, 9 Jan 2007 15:11:23 -0800 Randy Dunlap wrote:
> 
> > Hi,
> > 
> > (2.6.20-rc4, x86_64 1-proc on SMP kernel, 1 GB RAM)
> > 
> > I'm running fsx-linux (akpm ext3-tools version) on an ext3 fs
> > with data=journal and fs blocksize=2048.  I've been trying to
> > get some kind of kernel messages from it but I can't get any
> > debug IO done successfully.
> > 
> > It has hung on me 3 times in a row today.  I'm using this command:
> > fsx-linux -l 100M -N 50000 -S 0 fsxtestfile
> > 
> > This is run in a new partition on an IDE drive (/dev/hda7,
> > using legacy IDE drivers).
> > 
> > Any suggestions for debug output?  I can see SysRq output on-screen
> > (sometimes) but it doesn't make it to my serial console.
> > 
> > Any patches to test?  :)
> 
> More notes:
> Fails (hangs) with fs blocksize of 1024, 2048, or 4096.
> Only data=journal mode hangs; writeback and ordered modes run fine.
> 
> After several runs (hangs), I was able to get some sysrq output
> to the serial console.
> 
> kernel config:  http://oss.oracle.com/~rdunlap/configs/config-2620-rc4-hangs
> message log:    http://oss.oracle.com/~rdunlap/logs/fsx-capture.txt
> 
> Can anyone see what fsx-linux is waiting on there?
> 

Everybody got stuck in balance_dirty_pages().  The new thing in there is
that an nscd instance got stuck in balance_dirty_pages() via the pagefault
path's new set_page_dirty_balance() call, so its mmap_sem stays held, which
causes lots of other things to get stuck.

But I don't see why this should happen, really.  It all seems OK here. Is
any IO happening at all?

You don't have any shells at all?  If you do, try running /bin/sync,
see if the disk lights up.  Run `watch -n1 cat /proc/meminfo' when testing
to see what dirty memory is doing.  And `vmstat 1'.  Try sysrq-S, see if
that gets things unstuck.
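
If watch(1) is awkward on the test box (say, only a serial console is
usable), the same dirty-memory numbers can be sampled with a few lines of
Python.  This is only a rough sketch: the names parse_meminfo(),
dirty_snapshot() and watch() are made up for illustration, and it assumes
the usual "Name:   value kB" layout of /proc/meminfo.

```python
import time


def parse_meminfo(text):
    """Map /proc/meminfo field names to integer values (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_snapshot(text):
    """Return (Dirty, Writeback) in kB; a missing field comes back as None."""
    info = parse_meminfo(text)
    return info.get("Dirty"), info.get("Writeback")


def watch(interval=1.0, samples=None, path="/proc/meminfo"):
    """Print Dirty/Writeback every `interval` seconds (forever if samples is None)."""
    taken = 0
    while samples is None or taken < samples:
        with open(path) as f:
            dirty, writeback = dirty_snapshot(f.read())
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} Dirty={dirty} kB Writeback={writeback} kB", flush=True)
        taken += 1
        time.sleep(interval)
```

Call watch() from a second shell before starting fsx-linux and leave it
running.  The last few lines printed before the hang show whether dirty
memory was still climbing and whether any writeback was in flight.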

I guess it's consistent with the disk system losing its brains, too.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
