Hello,
So I resolved my previous issue with split-brains and the lack
of self-healing by downgrading my installed glusterfs* packages
from 3.6.2 to 3.5.3, but now I've hit a new issue, which makes
normal use of the volume practically impossible.
A little background for those not already paying close
attention:
I have a 2-node, 2-brick replicated volume whose purpose in
life is to hold iSCSI target image files, primarily to provide
datastores to a VMware ESXi cluster. The plan is to put a
handful of image files on the Gluster volume, mount the volume
locally on both Gluster nodes, and run tgtd on both, pointed at
the image files on the mounted Gluster volume. The ESXi boxes
will then use multipath (active/passive) iSCSI to connect to
the nodes, with automatic failover in case of planned or
unplanned downtime of either Gluster node.
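For reference, the setup is roughly along these lines; the
volume name, brick paths, target IQN, and the second node's
hostname ("apollo") below are placeholders rather than the real
values:

    # create and start the 2-brick replica 2 volume (run on one node)
    gluster volume create gv0 replica 2 duke:/export/brick1 apollo:/export/brick1
    gluster volume start gv0

    # /etc/tgt/targets.conf on each node, with the backing store
    # living on the locally mounted Gluster volume
    <target iqn.2015-03.net.example:datastore1>
        backing-store /mnt/gluster_disk/datastore1.img
    </target>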
In my most recent round of testing with 3.5.3, writes to the
volume start failing en masse after about 5-10 minutes, so I've
simplified the scenario a bit (to minimize the variables): both
Gluster nodes up, only one node (duke) with the volume mounted
and running tgtd, and just regular (single-path) iSCSI from a
single ESXi server.
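For what it's worth, the volume is mounted on duke with the
normal FUSE client, something like this (again, gv0 stands in
for the actual volume name):

    mount -t glusterfs duke:/gv0 /mnt/gluster_disk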
About 5-10 minutes into migrating a VM onto the test
datastore, /var/log/messages on duke gets blasted with a ton
of messages exactly like this:
Mar 15 22:24:06 duke tgtd: bs_rdwr_request(180) io error 0x1781e00 2a -1 512 22971904, Input/output error
And /var/log/glusterfs/mnt-gluster_disk.log gets blasted with
a ton of messages exactly like this:
[2015-03-16 02:24:07.572279] W [fuse-bridge.c:2242:fuse_writev_cbk] 0-glusterfs-fuse: 635299: WRITE => -1 (Input/output error)