Trying to debug an IO hanging situation

Hi.

I've been trying to find out what's going on for several days now, but can't find anything myself, so I'm asking for some help from the GlusterFS experts ;-)

I'm running 3 replicated gluster volumes between 2 nodes (each node hosting 3 bricks: one per volume). Components involved:

- CentOS 7.0 x86_64 / 3.10.0-123.20.1
- GlusterFS 3.5.3

(yes, I should upgrade, I know).

This is used to host qemu-kvm VMs (1 GlusterFS volume for the VM images, 1 for the libvirt locks, 1 for the VM states, so that e.g. a "virsh save vm1" can be restored on the other node). The VMs are hosted on the GlusterFS servers themselves (each node fuse-mounts the storage volume on /var/lib/libvirt/images), so both nodes are GlusterFS server and client at the same time. The VMs run only on the first node (but can be live-migrated to the second one in case of problem).
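
For reference, each node fuse-mounts the volumes locally, something like this (written from memory, so the backupvolfile-server option is an approximation of what's really in my fstab):

[root@master1 ~]# mount -t glusterfs -o backupvolfile-server=master2 master1:/vmstore /var/lib/libvirt/images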

The 3 volumes (vmstore, save and locks) have the same configuration:

[root@master1 ~]# gluster vol info vmstore
 
Volume Name: vmstore
Type: Replicate
Volume ID: 7ed967f1-3b33-46d7-8908-0bb78c6e9199
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: master1:/mnt/bricks/vmstore
Brick2: master2:/mnt/bricks/vmstore
Options Reconfigured:
diagnostics.client-log-level: DEBUG
diagnostics.brick-log-level: INFO
cluster.eager-lock: on
network.frame-timeout: 300
network.ping-timeout: 20
nfs.disable: on


This setup worked well for more than a year, but had a big failure 3 months ago: all my VMs had a kernel panic because they couldn't access their storage anymore. Looking at my logs, I saw that the gluster fuse client lost the connection with both bricks because they hadn't responded for more than 5 sec (which was the network.ping-timeout at that time). I don't really understand how this could happen, as the network was OK, and anyway one of the bricks is running on 127.0.0.1, so it's definitely not a network issue. I've increased network.ping-timeout to 20 sec, which allowed all my VMs to be started again without the connection to the bricks being lost.
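
For the record, raising the timeout was just this (vmstore shown, same for the save and locks volumes):

[root@master1 ~]# gluster volume set vmstore network.ping-timeout 20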

Now things are working, but since that day I get random I/O hangs from time to time. When the problem occurs, all I/O in all the VMs hangs, and the load on the hypervisor (which is also the GlusterFS client and one of the bricks) goes crazy (I've seen up to ~120). The load goes so high that I can't do anything on the hypervisor: I lose my SSH access, which doesn't respond anymore. The problem lasts for 5 or 10 minutes, then everything starts working again (some VMs don't like being stuck for that long and need to be restarted).

The problem is very random: it can happen every 2 days, or everything can work without a single issue for more than 3 weeks. It doesn't seem to depend on the load, nor on the access pattern.

I suspect something in Gluster is the culprit, but I can't find anything. I've enabled DEBUG logging on the client (but not on the bricks, as it's just too verbose there), and will see if I can get more info next time the issue happens.
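
The log levels were changed with the usual volume options, something like this (this matches the "Options Reconfigured" section above):

[root@master1 ~]# gluster volume set vmstore diagnostics.client-log-level DEBUG
[root@master1 ~]# gluster volume set vmstore diagnostics.brick-log-level INFO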

I first noticed that the problem always happened when I executed a monitoring script (which runs several gluster commands and parses their output to check the status of the different volumes; the script is available here [1] if anyone is interested), but I've now completely disabled the monitoring, and I still have this random issue.
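
To give an idea, the checks boil down to something like this (a very simplified sketch; the real script at [1] does the actual parsing and is run through sudo by the monitoring agent):

for vol in vmstore save locks; do
    gluster volume info $vol
    gluster volume status $vol
    gluster volume heal $vol info
done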

A strange thing I've noticed is that the main volume (the one storing the VM images) continuously shows files being healed if I look at:

gluster vol heal vmstore info healed

Every 10 (exactly 10) minutes I see a few VM images listed as healed. But nothing in the client logs nor in the system load indicates a heal taking place.
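
For completeness, these are the heal-related commands available on that volume (I've mostly been watching the "info healed" output):

[root@master1 ~]# gluster volume heal vmstore info
[root@master1 ~]# gluster volume heal vmstore info healed
[root@master1 ~]# gluster volume heal vmstore info heal-failed
[root@master1 ~]# gluster volume heal vmstore info split-brain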

I'm lost and don't know where to look, I'd really appreciate some help :-)

(we're ready to hire a GlusterFS expert to help us sort this out if necessary, this is a critical installation for us)

[1]: https://gitweb.firewall-services.com/?p=zabbix-agent-addons;a=blob_plain;f=zabbix_scripts/check_gluster_sudo;hb=HEAD
--

Daniel Berteaud

FIREWALL-SERVICES SAS.
Société de Services en Logiciels Libres
Tel : 05 56 64 15 32
Visio : http://vroom.im/dani
www.firewall-services.com
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
