FW: fix-layout stalls with xattr errors

Hello Shylesh,
Thanks for looking into this for me.  I think the ext4 features are 
missing because the filesystems were accidentally formatted as ext3 and 
then mounted as ext4.  I didn't realise that was possible until I 
started investigating this fix-layout problem.  I don't know how I 
managed to make the same mistake on both replicated bricks but I can't 
think of any other explanation.  I mounted the filesystems as ext3 and 
tried the rebalance again, but the result was the same.  Then I tried 
converting the filesystems to ext4, as described in various CentOS 
forums and blogs including this one: 
http://blog.secaserver.com/2011/08/linux-converting-ext3-ext4-for-centos-5. 
Unfortunately, the "Operation not supported" errors were still there 
during the fix-layout, so it seems that the damage has already been done 
by mounting the ext3 filesystems as ext4.   Perhaps xattrs on new files 
would be created correctly in the converted bricks, but I really need to 
find a way to repair the GlusterFS xattrs on the existing files.  Is 
there a way of doing this?

Regards
Dan.
> Hi Dan,
>
> I created two bricks, both with ext4 filesystems.
>
> The issue seems to be in the filesystem features that you have disabled.
>
> Formatted *brick1* with ext4:
>
> [root@SERVER1 mnt]# dumpe2fs /dev/sda | grep 'Filesystem features'
> dumpe2fs 1.41.12 (17-May-2010)
> Filesystem features:      has_journal ext_attr resize_inode dir_index 
> filetype needs_recovery extent flex_bg sparse_super large_file 
> huge_file uninit_bg dir_nlink extra_isize
>
> Formatted *brick2* with ext4:
> [root@SERVER2 ~]# dumpe2fs /dev/sda | grep 'Filesystem features'
> dumpe2fs 1.41.12 (17-May-2010)
> Filesystem features:      has_journal ext_attr resize_inode dir_index 
> filetype extent flex_bg sparse_super large_file
>
> As you described, I disabled some of the features on *brick2*.
>
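A minimal sketch of the comparison above: it lists which features on a reference brick's "Filesystem features" line are absent from another brick's. The function name `missing_features` is hypothetical, and the two lists are copied from the dumpe2fs output quoted above rather than read from a device.

```shell
# missing_features REFERENCE CANDIDATE
# Print, one per line, each feature in REFERENCE that CANDIDATE lacks.
missing_features() {
    # Collapse whitespace and pad with spaces so whole words can be matched.
    have=" $(printf '%s ' $2)"
    for feature in $1; do
        case "$have" in
            *" $feature "*) ;;                 # present on the candidate
            *) printf '%s\n' "$feature" ;;     # absent from the candidate
        esac
    done
}

# Feature lists as reported for brick1 and brick2 in the message above.
brick1="has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize"
brick2="has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file"

missing_features "$brick1" "$brick2"
```

For these two lists it prints needs_recovery, huge_file, uninit_bg, dir_nlink and extra_isize. Note that ext_attr itself appears in both lists.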
> I created a distribute volume with these two bricks, created some 
> files on the mount point, and tried setting xattrs on these files.
>
> I got these error messages:
> =======================================================================================
> [2011-12-30 01:57:22.551634] I 
> [client3_1-fops.c:818:client3_1_setxattr_cbk] 1-test-client-1: remote 
> operation failed: Operation not supported
> [2011-12-30 01:57:22.551658] W [fuse-bridge.c:850:fuse_err_cbk] 
> 0-glusterfs-fuse: 201305: SETXATTR() /92 => -1 (Operation not supported)
> [2011-12-30 01:57:22.556490] I 
> [client3_1-fops.c:818:client3_1_setxattr_cbk] 1-test-client-1: remote 
> operation failed: Operation not supported
> [2011-12-30 01:57:22.556520] W [fuse-bridge.c:850:fuse_err_cbk] 
> 0-glusterfs-fuse: 201311: SETXATTR() /95 => -1 (Operation not supported)
> [2011-12-30 01:57:22.564089] I 
> [client3_1-fops.c:818:client3_1_setxattr_cbk] 1-test-client-1: remote 
> operation failed: Operation not supported
> [2011-12-30 01:57:22.564114] W [fuse-bridge.c:850:fuse_err_cbk] 
> 0-glusterfs-fuse: 201321: SETXATTR() /100 => -1 (Operation not supported)
> ========================================================================================
>
> By contrast, when I created another volume with only *brick1*, 
> everything went smoothly.
> So I suspect the problem is not with rebalance but with the ext4 
> features that are disabled on *brick2*.
>
> Please let me know if I am missing anything that can be tried.
>
>
>
>
> Thanks,
> Shylesh
>
>> ------------------------------------------------------------------------
>> *From:* gluster-users-bounces at gluster.org 
>> [gluster-users-bounces at gluster.org] on behalf of Dan Bretherton 
>> [d.a.bretherton at reading.ac.uk]
>> *Sent:* Thursday, December 29, 2011 6:05 AM
>> *To:* gluster-users
>> *Subject:* fix-layout stalls with xattr errors
>>
>> Hello All-
>> I am having problems with rebalance ... fix-layout in version 3.2.5. 
>>  I extended a volume with add-brick but the fix-layout stalls after a 
>> small number of layout fixes and does not make any more progress.  I 
>> have tried the operation twice on different servers with the same 
>> result.  The following errors are found in the fuse mount log file on 
>> the server carrying out the operation.
>>
>>     [2011-12-28 21:38:14.840013] I
>>     [afr-common.c:1038:afr_launch_self_heal] 0-nemo2-replicate-4:
>>     background  data self-heal triggered. path:
>>     /users/hzu/DATA/ERAINT/ORCA025/2010/snow_ERAINT_2010.nc
>>     [2011-12-28 21:38:15.93079] E
>>     [client3_1-fops.c:1498:client3_1_fxattrop_cbk] 0-nemo2-client-8:
>>     remote operation failed: Operation not supported
>>     [2011-12-28 21:38:15.93141] E
>>     [client3_1-fops.c:1498:client3_1_fxattrop_cbk] 0-nemo2-client-9:
>>     remote operation failed: Operation not supported
>>     [2011-12-28 21:38:15.93385] I
>>     [client3_1-fops.c:1187:client3_1_fstat_cbk] 0-nemo2-client-8:
>>     remote operation failed: Operation not supported
>>     [2011-12-28 21:38:15.93521] I
>>     [client3_1-fops.c:1187:client3_1_fstat_cbk] 0-nemo2-client-9:
>>     remote operation failed: Operation not supported
>>
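A rough sketch for summarising errors like those above: it counts "Operation not supported" failures per client in a mount log. It assumes each log entry sits on a single line, as in the log file itself (unlike the wrapped quote above). The function name and the log path in the usage note are hypothetical.

```shell
# count_unsupported_by_client: read a GlusterFS client log on stdin and print
# "<client> <count>" for each client that logged the failure.
count_unsupported_by_client() {
    # Keep only the client name that precedes the failure message,
    # then count how often each name occurs.
    sed -n 's/^.*\] \([^ ]*\): remote operation failed: Operation not supported.*$/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}
```

It could be run as `count_unsupported_by_client < /path/to/mount.log`. For the four failure entries quoted above it would report two failures each for 0-nemo2-client-8 and 0-nemo2-client-9.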
>>
>> The file in the error message is a link, and it is not broken as seen 
>> from the volume mount point or the bricks.
>>
>> There are some worrying error messages in the brick log files for 
>> nemo2-client-8 and nemo2-client-9.  Here are some excerpts from the 
>> nemo2-client-8 log, which is similar to the nemo2-client-9 log.
>>
>>     [2011-12-28 21:23:05.827877] W [posix.c:3928:do_xattrop]
>>     0-nemo2-posix: Extended attributes not supported by filesystem
>>     [2011-12-28 21:23:05.827932] I
>>     [server3_1-fops.c:1705:server_fxattrop_cbk] 0-nemo2-server: 8438:
>>     FXATTROP 0 (-2111276040) ==> -1 (Operation not supported)
>>     [2011-12-28 21:23:05.828848] E [posix.c:4200:posix_fstat]
>>     0-nemo2-posix: fstat failed on fd=0x2aaaac703804: Operation not
>>     supported
>>     [2011-12-28 21:23:05.828879] I
>>     [server3_1-fops.c:1113:server_fstat_cbk] 0-nemo2-server: 8439:
>>     FSTAT 0 (-2111276040) ==> -1 (Operation not supported)
>>     [2011-12-28 21:29:29.871213] W
>>     [socket.c:1494:__socket_proto_state_machine] 0-tcp.nemo2-server:
>>     reading from socket failed. Error (Transport endpoint is
>>     not connected), peer (192.171.166.81:1003)
>>     [2011-12-28 21:29:29.871305] I
>>     [server-helpers.c:360:do_lock_table_cleanup] 0-nemo2-server:
>>     inodelk released on
>>     /users/hzu/DATA/ERAINT/ORCA025/2010/snow_ERAINT_2010.nc
>>     [2011-12-28 21:29:29.871345] I
>>     [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>>     on /users/hzu/DATA/ERAINT/ORCA025/2010/snow_ERAINT_2010.nc
>>
>>     [2011-12-28 21:34:36.190023] I
>>     [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup on /
>>     [2011-12-28 21:34:36.190055] I
>>     [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>>     on /users
>>     [2011-12-28 21:34:36.190086] I
>>     [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>>     on /users/hzu
>>     [2011-12-28 21:34:36.190102] I
>>     [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>>     on /users/hzu/DATA
>>     [2011-12-28 21:34:36.190135] I
>>     [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>>     on /users/hzu/DATA/ERAINT
>>     [2011-12-28 21:34:36.190154] I
>>     [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>>     on /users/hzu/DATA/ERAINT/ORCA025
>>     [2011-12-28 21:34:36.190171] I
>>     [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
>>     on /users/hzu/DATA/ERAINT/ORCA025/2009
>>
>>     [2011-12-28 21:38:15.92433] I
>>     [server3_1-fops.c:1705:server_fxattrop_cbk] 0-nemo2-server:
>>     12228: FXATTROP 7 (-2111276040) ==> -1 (Operation not supported)
>>     [2011-12-28 21:38:15.92743] E [posix.c:4200:posix_fstat]
>>     0-nemo2-posix: fstat failed on fd=0x2aaaac703804: Operation not
>>     supported
>>     [2011-12-28 21:38:15.92775] I
>>     [server3_1-fops.c:1113:server_fstat_cbk] 0-nemo2-server: 12229:
>>     FSTAT 7 (-2111276040) ==> -1 (Operation not supported)
>>
>>
>> The backend filesystems are ext4 and they are mounted with options 
>> "acl,user_xattr".  I tested extended attribute support (as suggested 
>> here: 
>> http://gluster.org/pipermail/gluster-users/2010-December/006257.html) 
>> and could not find any problems, so I don't understand the "Extended 
>> attributes not supported by filesystem" error.  The only unusual 
>> thing about the filesystems is the reduced number of filesystem 
>> features enabled compared to other bricks.  These are the ext4 
>> features enabled.
>>
>> has_journal ext_attr resize_inode dir_index filetype needs_recovery 
>> sparse_super large_file
>>
>> All the other bricks in the volume have these features plus extent, 
>> flex_bg, huge_file, uninit_bg, dir_nlink and extra_isize.  I don't 
>> know if any of these missing ext4 features are part of the problem. 
>>  Does anybody know what's going on here?
>>
>> Regards
>> Dan.
>>
>>
>