Re: brick multiplexing regression is broken

On 10/06/2017 11:08 AM, Mohit Agrawal wrote:
Without a patch the test case will fail; that is expected behavior.
When I said without patches, I meant it is failing on current HEAD on master, which has the commit 9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8.
-Ravi


Regards
Mohit Agrawal

On Fri, Oct 6, 2017 at 11:04 AM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:

The test is failing on master without any patches:

[root@tuxpad glusterfs]# prove tests/bugs/bug-1371806_1.t
tests/bugs/bug-1371806_1.t .. 7/9 setfattr: ./tmp1: No such file or directory
setfattr: ./tmp2: No such file or directory
setfattr: ./tmp3: No such file or directory
setfattr: ./tmp4: No such file or directory
setfattr: ./tmp5: No such file or directory
setfattr: ./tmp6: No such file or directory
setfattr: ./tmp7: No such file or directory
setfattr: ./tmp8: No such file or directory
setfattr: ./tmp9: No such file or directory
setfattr: ./tmp10: No such file or directory
./tmp1: user.foo: No such attribute
tests/bugs/bug-1371806_1.t .. Failed 2/9 subtests

Mount log for one of the directories:
[2017-10-06 05:32:10.059798] I [MSGID: 109005] [dht-selfheal.c:2458:dht_selfheal_directory] 0-patchy-dht: Directory selfheal failed: Unable to form layout for directory /tmp1
[2017-10-06 05:32:10.060013] E [MSGID: 109011] [dht-common.c:5011:dht_dir_common_setxattr] 0-patchy-dht: Failed to get mds subvol for path /tmp1gfid is 00000000-0000-0000-0000-000000000000
[2017-10-06 05:32:10.060041] W [fuse-bridge.c:1377:fuse_err_cbk] 0-glusterfs-fuse: 99: SETXATTR() /tmp1 => -1 (No such file or directory)

I request the patch authors to take a look at it.
Thanks
Ravi


On 10/05/2017 06:04 PM, Atin Mukherjee wrote:
The following commit has broken the brick multiplexing regression job. tests/bugs/bug-1371806_1.t has failed a couple of times. One of the latest regression job reports is at https://build.gluster.org/job/regression-test-with-multiplex/406/console .


commit 9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8
Author: Mohit Agrawal <moagrawa@xxxxxxxxxx>
Date:   Fri May 12 21:12:47 2017 +0530

    cluster/dht : User xattrs are not healed after brick stop/start
   
    Problem: In a distributed volume, a custom extended attribute value for a directory
             does not display the correct value after a brick is stopped/started or a
             new brick is added. If any extended (acl) attribute value is set for a
             directory after the brick is stopped/added, the attribute (user|acl|quota)
             value is not updated on the brick after the brick is started.
   
    Solution: First store hashed subvol or subvol(has internal xattr) on inode ctx and
              consider it as a MDS subvol. At the time of updating a custom xattr
              (user, quota, acl, selinux) on a directory, first check the mds from
              inode ctx; if mds is not present on inode ctx then throw an EINVAL error
              to the application, otherwise set the xattr on the MDS subvol with an
              internal xattr value of -1 and then try to update the attribute on the
              other non MDS volumes also. If the mds subvol is down, in that case throw
              an error "Transport endpoint is not connected". In dht_dir_lookup_cbk|
              dht_revalidate_cbk|dht_discover_complete call dht_call_dir_xattr_heal
              to heal the custom extended attribute.
              In case of gnfs server, if the hashed subvol is not found based on
              loc then wind a call on all subvols to update the xattr.
   
    Fix:    1) Save MDS subvol on inode ctx
            2) Check if mds subvol is present on inode ctx
            3) If mds subvol is down then call unwind with error ENOTCONN and if it is up
               then set new xattr "GF_DHT_XATTR_MDS" to -1 and wind a call on other
               subvol.
            4) If setxattr fop is successful on non-mds subvol then increment the value of
               internal xattr to +1
            5) At the time of directory_lookup check the value of new xattr GF_DHT_XATTR_MDS
            6) If value is not 0 in dht_lookup_dir_cbk(other cbk) functions then call heal
               function to heal user xattr
            7) syncop_setxattr on hashed_subvol to reset the value of xattr to 0
               if heal is successful on all subvol.
   
    Test : To reproduce the issue followed below steps
           1) Create a distributed volume and create mount point
           2) Create some directory from mount point mkdir tmp{1..5}
           3) Kill any one brick from the volume
           4) Set extended attribute from mount point on directory
              setfattr -n user.foo -v "abc" ./tmp{1..5}
              It will throw the error "Transport endpoint is not connected"
              for those directories whose hashed subvol is down
           5) Start volume with force option to start brick process
           6) Execute getfattr command on mount point for directory
           7) Check extended attribute on brick
              getfattr -n user.foo <volume-location>/tmp{1..5}
              It shows correct value for directories for those
              xattr fop were executed successfully.
   
    Note: The patch will resolve xattr healing problem only for fuse mount
          not for nfs mount.
   
    BUG: 1371806
    Signed-off-by: Mohit Agrawal <moagrawa@xxxxxxxxxx>
   
    Change-Id: I4eb137eace24a8cb796712b742f1d177a65343d5
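
For readers trying to follow the seven steps under "Fix" in the commit message above, here is a rough toy model of the counter protocol in Python. It is only a sketch and not the DHT code: the names Subvol, dir_setxattr, dir_lookup_heal and mds_counter are hypothetical, the field mds_counter merely stands in for the internal xattr the commit calls GF_DHT_XATTR_MDS, and it assumes that step 4 means "add 1, bringing the value from -1 back to 0", which is one reading of the commit text.

```python
import errno


class Subvol:
    """Toy stand-in for one DHT subvolume holding a single directory's xattrs."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.xattrs = {}      # custom (user) xattrs on the directory
        self.mds_counter = 0  # hypothetical model of GF_DHT_XATTR_MDS


def dir_setxattr(mds, others, key, value):
    """Steps 2-4: write to the MDS first, then fan out to the other subvols."""
    if mds is None:
        return -errno.EINVAL    # no MDS recorded on the inode ctx
    if not mds.up:
        return -errno.ENOTCONN  # "Transport endpoint is not connected"
    mds.xattrs[key] = value
    mds.mds_counter = -1        # step 3: mark a heal as pending
    reached_all = True
    for sv in others:
        if sv.up:
            sv.xattrs[key] = value
        else:
            reached_all = False
    if reached_all:
        mds.mds_counter += 1    # step 4 (assumed): -1 + 1 == 0, nothing to heal
    return 0


def dir_lookup_heal(mds, others):
    """Steps 5-7: on lookup, heal from the MDS if its counter is non-zero.

    Returns True only if a heal ran and reached every subvol.
    """
    if mds is None or not mds.up or mds.mds_counter == 0:
        return False
    healed_all = True
    for sv in others:
        if sv.up:
            sv.xattrs = dict(mds.xattrs)  # simplification: copy everything
        else:
            healed_all = False
    if healed_all:
        mds.mds_counter = 0     # step 7: reset once every subvol is healed
    return healed_all


# Usage: one non-MDS brick is down during setxattr, then comes back.
mds, b1, b2 = Subvol("mds"), Subvol("b1"), Subvol("b2")
b2.up = False
dir_setxattr(mds, [b1, b2], "user.foo", "abc")  # b2 misses the xattr
b2.up = True
dir_lookup_heal(mds, [b1, b2])                  # lookup copies it to b2
```

In this model the failure reported in this thread (SETXATTR returning an error because no MDS subvol could be found for the directory) would correspond to the first early-return branch of dir_setxattr, although the real code path and errno may differ.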



_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel


