Re: regression in ext4_ind_remove_space

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 1 Feb 2015, Chris J Arges wrote:

> Date: Sun, 01 Feb 2015 15:46:06 -0600
> From: Chris J Arges <chris.j.arges@xxxxxxxxxxxxx>
> To: Lukas Czerner <lczerner@xxxxxxxxxx>, Theodore Ts'o <tytso@xxxxxxx>,
>     Jan Kara <jack@xxxxxxx>
> Cc: linux-fsdevel@xxxxxxxxxxxxxxx
> Subject: Re: regression in ext4_ind_remove_space
> 
> 
> 
> On 01/30/2015 03:14 PM, Chris J Arges wrote:
> > 
> > 
> > On 01/30/2015 12:56 PM, Chris J Arges wrote:
> >> Hi,
> >>
> >> Users of non-extent ext4 filesystems (ext4 ^extents, or ext3 w/
> >> CONFIG_EXT4_USE_FOR_EXT23=y) can encounter data corruption when using
> >> fallocate with FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE flags. This
> >> seems to be a regression in ext4_ind_remove_space introduced in
> >> 4f579ae7, whereas commit 77ea2a4b passes the following test case.
> >>  
> >> To reproduce this issue do the following:
> >> 1) Setup ext4 ^extents, or ext3 filesystem with CONFIG_EXT4_USE_FOR_EXT23=y
> >> 2) Create and install a VM using a qcow2 image and store the file on the
> >> filesystem
> >> 3) Snapshot the image with qemu-img
> >> 4) Boot the image and do some disk operations (fio,etc)
> >> 5) Shutdown image and delete snapshot
> >> 6) Repeat 3-5 until VM no longer boots due to image corruption,
> >> generally this takes a few iterations depending on disk operations.
> >>
> >> In addition, I've tested this with a single vCPU and single host CPU and
> >> the problem persists. Running the same test on ext4 w/ extents exhibits
> >> no failures.
> >>
> >> Any ideas for bug hunting here? Commit 4f579ae7 completely re-writes
> >> ext4_ind_remove_space. A revert of that commit from master fixes the
> >> issue, but I'm unsure if that un-fixes other things.
> > 
> > Well, after a bit more extensive testing a revert of this patch still
> > has failures. This may just be a novel bug.
> > 
> > I'll run longer tests on 77ea2a4b as well.
> > --chris
> 
> 77ea2a4b eventually fails (~20 iterations) but seems to be a bit more
> resilient. So this does not seem like a regression; but still
> ext4_ind_remove_space is at the heart of the issue. If I use the
> following code snippet below, I can run the test case for days (~400
> iterations so far) without failure:
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 5653fa4..e14cdfe 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3367,6 +3367,10 @@ int ext4_punch_hole(struct inode *inode, loff_t
> offset, loff_t length)
>  	unsigned int credits;
>  	int ret = 0;
> 
> +	/* EXTENTS required */
> +	if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)))
> +		return -EOPNOTSUPP;
> +
>  	if (!S_ISREG(inode->i_mode))
>  		return -EOPNOTSUPP;

Hi Chris,

unfortunately you're right. I've noticed this recently as well and I
am looking into this, however the indirect block handling code is
damn complicated (but maybe it's just me) and I was not able to find
the root cause of the problem yet.

Btw, this can also be very easily reproduced by running xfstest
generic/270 on 1k file system. That way I am able to reproduce the
problem in more than 90% runs.

Now since Al Viro was the one who wrote most of the indirect code and
since he is much smarter than me, maybe we can kindly ask him to
take a look at the ext4_ind_remove_space() to figure out where the
problem is ?

Al, could you take a quick look at this ?

Thanks!
-Lukas

> 
> 
> --chris
> 
> > 
> >  I'm happy to
> >> continue debugging, or run any tests as necessary.
> >>
> >> Thanks,
> >> --chris j arges
> >>
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux