Re: online resize of ext4 hung (3.2.51 / 1.42.5)

On Sat, Oct 26, 2013 at 10:51:24PM +0200, Jakob Haufe wrote:
> On Fri, 25 Oct 2013 19:57:45 -0400
> Theodore Ts'o <tytso@xxxxxxx> wrote:
> 
> > Can you run "echo t > /proc/sysrq-trigger" and send the output from
> > the console (or from dmesg)?  Or otherwise trigger sysrq-t?  This will
> > show the stacks of all of the processes, which would be useful to
> > figure out what might be happening.
> 
> As the log was most probably too big to pass majordomo, i've put it here:
> 
> http://permalink.sur5r.net/1/linux-3.2.51-resize2fs-1.42.5-hung-sysrq-t.log

(Sorry for the delay in responding, a number of us have been attending
a conference in Edinburgh, and I'm currently on vacation in Dublin.)

From looking at the sysrq-t output you sent, it looks like resize2fs
is stuck in jbd2_journal_lock_updates().  That function has already
incremented j_barrier_count, so all new attempts to start a
transaction handle are blocked, which explains the rest of the
processes stuck in start_this_handle().  Meanwhile,
jbd2_journal_lock_updates() is waiting for the number of outstanding
handles already started against the running transaction to drop to
zero, and for some reason this never happens.
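
To make that state easier to picture, here is a toy user-space model
of it.  This is only a sketch and not the kernel code: struct
toy_journal and the toy_*() helpers are hypothetical names made up for
illustration, there is no locking or sleeping, and a "false" return
stands in for the places where the real code would block.

```c
/* Toy model only; NOT kernel code.  All names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_journal {
	int barrier_count;	/* stands in for j_barrier_count */
	int updates;		/* handles started but not yet stopped */
};

/*
 * Models start_this_handle(): a new handle may only start while no
 * barrier is raised.  Returns false where the kernel would sleep.
 */
static bool toy_start_handle(struct toy_journal *j)
{
	if (j->barrier_count > 0)
		return false;
	j->updates++;
	return true;
}

/* Models stopping a handle: one fewer outstanding update. */
static void toy_stop_handle(struct toy_journal *j)
{
	assert(j->updates > 0);
	j->updates--;
}

/* First half of the lock_updates step: raise the barrier. */
static void toy_raise_barrier(struct toy_journal *j)
{
	j->barrier_count++;
}

/*
 * Second half: the locker may only proceed once every outstanding
 * handle has been stopped.  Returns false where the kernel would wait.
 */
static bool toy_updates_drained(const struct toy_journal *j)
{
	return j->updates == 0;
}

/* Models the unlock_updates step: drop the barrier again. */
static void toy_unlock_updates(struct toy_journal *j)
{
	assert(j->barrier_count > 0);
	j->barrier_count--;
}
```

In terms of this model, the hang corresponds to toy_raise_barrier()
having been called while updates is still nonzero, with nothing ever
calling toy_stop_handle(): toy_start_handle() keeps failing for
everyone else, and toy_updates_drained() never becomes true for the
resizer.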

One thing which I'm trying to figure out is why the resize2fs ioctl
needs to use the whole sequence of:

		jbd2_journal_lock_updates(EXT4_SB(sb)->s_journal);
		err2 = jbd2_journal_flush(EXT4_SB(sb)->s_journal);
		jbd2_journal_unlock_updates(EXT4_SB(sb)->s_journal);

anyway.  This flushes out the journal, but it's not obvious to me why
that is necessary, and removing it would speed up a file system resize
significantly.

In any case, I think it should be safe for you to reboot your system,
and after an fsck -f, your file system should be OK.

			    	     	      - Ted

P.S.  To ext4 developers: please note that the kernel involved,
v3.2.51, does _not_ have Jan Kara's reserved handles changes, which
were added in commit 8f7d89f36829.  I at first thought the hang might
have been related to changes in how jbd2_journal_lock_updates()
waits for j_reserved_credits to go to zero, but that was a blind
alley.