Re: brick multiplexing regression is broken

Mohit Agrawal <moagrawa@xxxxxxxxxx> · Fri, 6 Oct 2017 11:22:37 +0530

Hi,
  Thanks for clarifying. I am already looking into it and will upload a new patch soon to resolve the issue.

Regards
Mohit Agrawal

On Fri, Oct 6, 2017 at 11:14 AM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:

    On 10/06/2017 11:08 AM, Mohit Agrawal wrote:

      Without a patch the test case will fail; that is expected behavior.

    When I said without patches, I meant it is failing on current HEAD
    on master which has the commit
    9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8.

      -Ravi

        Regards
        Mohit Agrawal

        On Fri, Oct 6, 2017 at 11:04 AM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:

              The test is failing on master without any patches:
              [root@tuxpad glusterfs]# prove tests/bugs/bug-1371806_1.t
              tests/bugs/bug-1371806_1.t .. 7/9 setfattr: ./tmp1: No such file or directory
              setfattr: ./tmp2: No such file or directory
              setfattr: ./tmp3: No such file or directory
              setfattr: ./tmp4: No such file or directory
              setfattr: ./tmp5: No such file or directory
              setfattr: ./tmp6: No such file or directory
              setfattr: ./tmp7: No such file or directory
              setfattr: ./tmp8: No such file or directory
              setfattr: ./tmp9: No such file or directory
              setfattr: ./tmp10: No such file or directory
              ./tmp1: user.foo: No such attribute
              tests/bugs/bug-1371806_1.t .. Failed 2/9 subtests

              Mount log for one of the directories:

              [2017-10-06 05:32:10.059798] I [MSGID: 109005] [dht-selfheal.c:2458:dht_selfheal_directory] 0-patchy-dht: Directory selfheal failed: Unable to form layout for directory /tmp1

              [2017-10-06 05:32:10.060013] E [MSGID: 109011] [dht-common.c:5011:dht_dir_common_setxattr] 0-patchy-dht: Failed to get mds subvol for path /tmp1gfid is 00000000-0000-0000-0000-000000000000

              [2017-10-06 05:32:10.060041] W [fuse-bridge.c:1377:fuse_err_cbk] 0-glusterfs-fuse: 99: SETXATTR() /tmp1 => -1 (No such file or directory)

              I request the patch authors to take a look at it.

              Thanks

              Ravi

                  On 10/05/2017 06:04 PM, Atin Mukherjee wrote:

                    The following commit has broken the brick multiplexing
                      regression job. tests/bugs/bug-1371806_1.t has failed a
                      couple of times. One of the latest regression job reports
                      is at:
                      https://build.gluster.org/job/regression-test-with-multiplex/406/console

                      commit 9b4de61a136b8e5ba7bf0e48690cdb1292d0dee8
                      Author: Mohit Agrawal <moagrawa@xxxxxxxxxx>
                      Date:   Fri May 12 21:12:47 2017 +0530

                          cluster/dht : User xattrs are not healed after brick stop/start

                          Problem: In a distributed volume custom extended attribute value for a directory
                                   does not display correct value after stop/start or added newly brick.
                                   If any extended(acl) attribute value is set for a directory after stop/added
                                   the brick the attribute(user|acl|quota) value is not updated on brick
                                   after start the brick.

                          Solution: First store hashed subvol or subvol(has internal xattr) on inode ctx and
                                    consider it as a MDS subvol.At the time of update custom xattr
                                    (user,quota,acl, selinux) on directory first check the mds from
                                    inode ctx, if mds is not present on inode ctx then throw EINVAL error
                                    to application otherwise set xattr on MDS subvol with internal xattr
                                    value of -1 and then try to update the attribute on other non MDS
                                    volumes also.If mds subvol is down in that case throw an
                                    error "Transport endpoint is not connected". In dht_dir_lookup_cbk|
                                    dht_revalidate_cbk|dht_discover_complete call dht_call_dir_xattr_heal
                                    to heal custom extended attribute.
                                    In case of gnfs server if hashed subvol has not found based on
                                    loc then wind a call on all subvol to update xattr.

                          Fix:    1) Save MDS subvol on inode ctx
                                  2) Check if mds subvol is present on inode ctx
                                  3) If mds subvol is down then call unwind with error ENOTCONN and if it is up
                                     then set new xattr "GF_DHT_XATTR_MDS" to -1 and wind a call on other
                                     subvol.
                                  4) If setxattr fop is successful on non-mds subvol then increment the value of
                                     internal xattr to +1
                                  5) At the time of directory_lookup check the value of new xattr GF_DHT_XATTR_MDS
                                  6) If value is not 0 in dht_lookup_dir_cbk(other cbk) functions then call heal
                                     function to heal user xattr
                                  7) syncop_setxattr on hashed_subvol to reset the value of xattr to 0
                                     if heal is successful on all subvol.

                          Test : To reproduce the issue followed below steps
                                 1) Create a distributed volume and create mount point
                                 2) Create some directory from mount point mkdir tmp{1..5}
                                 3) Kill any one brick from the volume
                                 4) Set extended attribute from mount point on directory
                                    setfattr -n user.foo -v "abc" ./tmp{1..5}
                                    It will throw error " Transport End point is not connected "
                                    for those hashed subvol is down
                                 5) Start volume with force option to start brick process
                                 6) Execute getfattr command on mount point for directory
                                 7) Check extended attribute on brick
                                    getfattr -n user.foo <volume-location>/tmp{1..5}
                                    It shows correct value for directories for those
                                    xattr fop were executed successfully.

                          Note: The patch will resolve xattr healing problem only for fuse mount
                                not for nfs mount.

                          BUG: 1371806
                          Signed-off-by: Mohit Agrawal <moagrawa@xxxxxxxxxx>
                          Change-Id: I4eb137eace24a8cb796712b742f1d177a65343d5
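
For readers following the thread, the "Fix" steps 1) to 7) in the quoted commit message can be modelled with a small in-memory simulation. This is only a sketch and not GlusterFS code: the Subvol and ToyDht classes and their methods are hypothetical names invented for illustration, and only the xattr name GF_DHT_XATTR_MDS comes from the commit message. It also assumes one particular reading of steps 3 and 4, namely that the internal counter is decremented by 1 before the setxattr is wound to the other subvols and incremented by 1 once they all succeed, so that a non-zero counter marks a pending heal.

```python
import errno

MDS_XATTR = "GF_DHT_XATTR_MDS"  # xattr name taken from the commit message


class Subvol:
    """Toy stand-in for a DHT subvolume; hypothetical, not a GlusterFS API."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.xattrs = {}  # directory path -> {xattr name: value}

    def setxattr(self, path, key, value):
        if not self.up:
            raise OSError(errno.ENOTCONN, "Transport endpoint is not connected")
        self.xattrs.setdefault(path, {})[key] = value


class ToyDht:
    """Toy model of the MDS bookkeeping described in the commit message."""

    def __init__(self, subvols):
        self.subvols = subvols
        self.inode_ctx = {}  # directory path -> MDS subvol

    def mkdir(self, path, mds):
        for sv in self.subvols:
            sv.xattrs[path] = {}
        mds.xattrs[path][MDS_XATTR] = 0
        self.inode_ctx[path] = mds  # step 1: save MDS subvol on inode ctx

    def setxattr(self, path, key, value):
        mds = self.inode_ctx.get(path)
        if mds is None:  # step 2: MDS must be present on inode ctx
            raise OSError(errno.EINVAL, "no MDS subvol on inode ctx")
        if not mds.up:  # step 3: MDS down, unwind with ENOTCONN
            raise OSError(errno.ENOTCONN, "Transport endpoint is not connected")
        mds.xattrs[path][MDS_XATTR] -= 1  # step 3: mark the update as pending
        mds.setxattr(path, key, value)
        all_ok = True
        for sv in self.subvols:
            if sv is mds:
                continue
            try:
                sv.setxattr(path, key, value)
            except OSError:
                all_ok = False  # a non-MDS subvol missed the update
        if all_ok:
            mds.xattrs[path][MDS_XATTR] += 1  # step 4: nothing left to heal

    def lookup(self, path):
        mds = self.inode_ctx[path]
        if not mds.up or mds.xattrs[path][MDS_XATTR] == 0:  # step 5
            return
        # step 6: copy user xattrs from the MDS to every other subvol
        user = {k: v for k, v in mds.xattrs[path].items() if k != MDS_XATTR}
        healed = True
        for sv in self.subvols:
            if sv is mds:
                continue
            if not sv.up:
                healed = False
                continue
            sv.xattrs.setdefault(path, {}).update(user)
        if healed:
            mds.xattrs[path][MDS_XATTR] = 0  # step 7: reset after full heal


if __name__ == "__main__":
    b0, b1 = Subvol("b0"), Subvol("b1")
    dht = ToyDht([b0, b1])
    dht.mkdir("/tmp1", mds=b0)
    b1.up = False                        # kill one non-MDS brick
    dht.setxattr("/tmp1", "user.foo", "abc")
    b1.up = True                         # volume start force
    dht.lookup("/tmp1")                  # lookup triggers the heal
    print(b1.xattrs["/tmp1"])
```

The model also suggests why the failure Ravi reports surfaces as a SETXATTR error: when no MDS subvol is recorded for a directory, the setxattr path refuses to proceed. The errno the toy raises in that case follows the commit text (EINVAL) and is not meant to match the ENOENT seen in the mount log.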

                _______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel
