I dug out a gluster-users mailing list thread from June 2011 at
http://gluster.org/pipermail/gluster-users/2011-June/008111.html. In
this post, Marco Agostini said:
==================================================
Craig Carl told me, three days ago:
------------------------------------------------------
that happens because Gluster's self heal is a blocking operation. We
are working on a non-blocking self heal, we are hoping to ship it in
early September.
------------------------------------------------------
==================================================

It looks like even with the release of 3.3.1, self heal is still a
blocking operation. I am wondering why the official Administration
Guide doesn't warn readers about something this important for
production operation.

On Mon, Nov 26, 2012 at 5:46 PM, ZHANG Cheng <czhang.oss at gmail.com> wrote:
> Early this morning our 2-brick replicated cluster had an outage. The
> disk space on one of the brick servers (brick02) was used up. By the
> time we responded to the disk-full alert, the issue had already
> lasted for a few hours. We reclaimed some disk space and rebooted the
> brick02 server, expecting that once it came back it would self heal.
>
> It did start self healing, but after just a couple of minutes, access
> to the gluster filesystem froze. Tons of "nfs: server brick not
> responding, still trying" messages popped up in dmesg. The load
> average on the app servers went up to around 200 from the usual 0.10.
> We had to shut down the brick02 server, or stop the gluster server
> process on it, to get the cluster working again.
>
> How can we deal with this issue? Thanks in advance.
>
> Our gluster setup follows the official doc.
>
> gluster> volume info
>
> Volume Name: staticvol
> Type: Replicate
> Volume ID: fdcbf635-5faf-45d6-ab4e-be97c74d7715
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: brick01:/exports/static
> Brick2: brick02:/exports/static
>
> The underlying filesystem is xfs (on an LVM volume):
> /dev/mapper/vg_node-brick on /exports/static type xfs
> (rw,noatime,nodiratime,nobarrier,logbufs=8)
>
> The brick servers don't act as gluster clients.
>
> Our app servers are the gluster clients, mounting via NFS:
> brick:/staticvol on /mnt/gfs-static type nfs
> (rw,noatime,nodiratime,vers=3,rsize=8192,wsize=8192,addr=10.10.10.51)
>
> brick is a DNS round-robin record for brick01 and brick02.
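
For anyone else hitting this on 3.3.x, the heal commands at least make
the self-heal backlog visible while it runs. A minimal sketch, assuming
the volume name from the setup quoted above, run on one of the brick
servers:

  # entries still pending self heal
  gluster volume heal staticvol info
  # entries healed recently
  gluster volume heal staticvol info healed
  # entries the self-heal daemon failed to heal
  gluster volume heal staticvol info heal-failed

Watching the pending count shrink at least gives a rough sense of how
long the heal will take.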
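
There are also a couple of volume options that may soften the impact of
a big heal. A sketch only, not tested on this exact setup: the option
names are the stock 3.3 AFR options, and the values here are just a
starting point:

  # heal changed blocks instead of copying whole files,
  # which cuts the I/O a full heal generates
  gluster volume set staticvol cluster.data-self-heal-algorithm diff
  # cap the number of files healed in the background at once
  gluster volume set staticvol cluster.background-self-heal-count 8

Whether these would prevent the NFS freeze described above, I can't
say; they only reduce how much work the heal does at a time.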