On May 13, 2008 10:31 +0800, Tiger Yang wrote:
> This situation only happens we format ext3/4 with inode size more than 128
> and we have put xattr entries both in ibody and block.
> The consequences about this bug is we will lost the xattr block which
> pointed by i_file_acl with all xattr entires in it. We will alloc a new
> xattr block and put that large value entry in it. The old xattr block will
> become orphan block.

Tiger, thanks for finding this bug, and the patch (which fixes the problem
in our testing).

Signed-off-by: Andreas Dilger <adilger@xxxxxxx>

> Andrew Morton wrote:
>> On Mon, 12 May 2008 11:24:40 +0800
>> Tiger Yang <tiger.yang@xxxxxxxxxx> wrote:
>>
>>
>>> I met a bug when I try to replace a xattr entry in ibody with a big size
>>> value. But in ibody there has no space for the new value. So it should
>>> set new xattr entry in block and remove the old xattr entry in ibody.
>>>
>>> Best regards,
>>> tiger
>>>
>>>
>>> [xattr.patch text/x-patch (1.3KB)]
>>> This fix the uninitialized bs when we try to replace a xattr entry in ibody with the new value which require more than free space.
>>>
>>> Signed-off-by: Tiger Yang <tiger.yang@xxxxxxxxxx>
>>>
>>> diff --git a/fs/ext3/xattr.c b/fs/ext3/xattr.c
>>> index a6ea4d6..e1af9bd 100644
>>> --- a/fs/ext3/xattr.c
>>> +++ b/fs/ext3/xattr.c
>>> @@ -1000,6 +1000,11 @@ ext3_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index,
>>>  			i.value = NULL;
>>>  			error = ext3_xattr_block_set(handle, inode, &i, &bs);
>>>  		} else if (error == -ENOSPC) {
>>> +			if (EXT3_I(inode)->i_file_acl && !bs.s.base) {
>>> +				error = ext3_xattr_block_find(inode, &i, &bs);
>>> +				if (error)
>>> +					goto cleanup;
>>> +			}
>>>  			error = ext3_xattr_block_set(handle, inode, &i, &bs);
>>>  			if (error)
>>>  				goto cleanup;
>>>
>>
>> That sounds fairly bad.
>>
>> What are the consequences of this bug, when someone hits it?
The EAs in the external block (except the one being added) are lost, and
some blocks (or shared EA block references) are leaked.  In most cases this
is not fatal, but for Lustre I developed a test case where this causes the
file data to be lost (because the file layout is stored in the external
block if it is too large to store in the inode).

>> It appears that we should backport this fix into 2.6.25.x (and perhaps
>> earlier).  What do you think?

Code inspection shows this bug goes back to when the fast EA-in-inode
support was added to the vanilla kernel, at least 2.6.12 (when Git history
begins).  Sadly, the bug was NOT in the original CFS EA-in-inode patches
that we made for kernels 2.6.5 (SLES 9) and 2.6.9 (RHEL 4) (and still use
today) that were in 2.6.11-rc1-mm1, but was added during the later rewrite
of this code.

I suspect the reasons this bug hasn't been reported even when large inodes
are enabled (which is the default for newer e2fsprogs) are:
- it is uncommon to have multiple EAs on a file (usually SELinux is the
  only common one and it is relatively small)
- one of the EAs must already be too large to fit in the inode
- increasing the size of any EA after it is created is rare

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
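
For readers who want to see the failure mode discussed in this thread in isolation, here is a rough user-space sketch in C. It is only a toy model under stated assumptions: toy_inode, toy_block_search, toy_block_find, toy_block_set and toy_enospc_path are hypothetical names invented for illustration and are not the kernel's types or functions. The "found" flag stands in for bs.s.base being non-NULL, and "file_acl" stands in for i_file_acl.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins only; none of these are the kernel's real types. */
struct toy_inode {
	int file_acl;	/* external xattr block number, 0 if none */
};

struct toy_block_search {
	bool found;	/* plays the role of bs.s.base != NULL */
	int block;	/* block the search refers to */
};

/* Models the block lookup: attach the search to the existing block. */
static void toy_block_find(const struct toy_inode *inode,
			   struct toy_block_search *bs)
{
	if (inode->file_acl) {
		bs->found = true;
		bs->block = inode->file_acl;
	}
}

/*
 * Models the block update.  If no block was looked up, a new block is
 * used and the inode is pointed at it.  Returns the block number that
 * is no longer referenced (orphaned), or 0 if nothing was lost.
 */
static int toy_block_set(struct toy_inode *inode,
			 struct toy_block_search *bs, int new_block)
{
	int orphaned = 0;

	assert(new_block != 0);
	if (!bs->found) {
		orphaned = inode->file_acl;	/* old block drops out of reach */
		inode->file_acl = new_block;
		bs->found = true;
		bs->block = new_block;
	}
	/* else: the entry would be added to the existing block in place */
	return orphaned;
}

/* The -ENOSPC path, with or without the extra lookup the patch adds. */
int toy_enospc_path(struct toy_inode *inode, bool with_fix, int new_block)
{
	struct toy_block_search bs = { false, 0 };

	if (with_fix && inode->file_acl && !bs.found)
		toy_block_find(inode, &bs);
	return toy_block_set(inode, &bs, new_block);
}
```

Calling toy_enospc_path() with with_fix set to false on an inode that already has an external block models the leak: the old block number is returned as orphaned and the inode now points at the new block. With with_fix set to true the existing block is looked up first, so nothing is orphaned.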