On Sat, Jun 16, 2012 at 04:47:51PM -0400, Sean Fulton wrote:
> 1) The split-brain message is strange because there are only two
> server nodes and 1 client node which has mounted the volume via NFS
> on a floating IP. This was done to guarantee that only one node gets
> written to at any point in time, so there is zero chance that two
> nodes were updated simultaneously.

Are you using a distributed volume, or a replicated volume? Writes to a
replicated volume go to both nodes.

> [586898.273283] INFO: task flush-0:45:633954 blocked for more than 120 seconds.
> [586898.273290] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [586898.273295] flush-0:45 D ffff8806037592d0 0 633954 20 0x00000000
> [586898.273304] ffff88000d1ebbe0 0000000000000046 ffff88000d1ebd6c 0000000000000000
> [586898.273312] ffff88000d1ebce0 ffffffff81054444 ffff88000d1ebc80 ffff88000d1ebbf0
> [586898.273319] ffff8806050ac5f8 ffff880603759888 ffff88000d1ebfd8 ffff88000d1ebfd8
> [586898.273326] Call Trace:
> [586898.273335] [<ffffffff81054444>] ? find_busiest_group+0x244/0xb20
> [586898.273343] [<ffffffff811ab050>] ? inode_wait+0x0/0x20
> [586898.273349] [<ffffffff811ab05e>] inode_wait+0xe/0x20

Are you using XFS by any chance?

I started with XFS, because that was what the gluster docs recommend,
but eventually gave up on it. I can reproduce that sort of kernel
lockup on a 24-disk MD array within a short space of time - without
gluster, just by throwing four bonnie++ processes at it. The same tests
run with either ext4 or btrfs do not hang, at least not during two days
of continuous testing.

Of course, a kernel problem like this cannot be the fault of glusterfs
itself, since glusterfs runs entirely in userland.

Regards,

Brian.
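
P.S. On the first point, the quickest way to check is probably
"gluster volume info", which (as far as I remember) prints a "Type:"
line reading Replicate, Distribute or Distributed-Replicate:

    # the volume name here is just a placeholder
    gluster volume info myvol

If it says Replicate, every write from the NFS client lands on both
bricks regardless of which node is holding the floating IP, so the
floating IP on its own doesn't rule out split-brain.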
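
P.P.S. The bonnie++ test was nothing clever - roughly along these lines
(the mount point below is a placeholder, not the exact invocation I
used):

    # four concurrent bonnie++ runs against the XFS filesystem on the MD array
    for i in 1 2 3 4; do
        mkdir -p /mnt/md0/bonnie.$i
        bonnie++ -d /mnt/md0/bonnie.$i -u root &
    done
    wait

Each instance gets its own scratch directory on the filesystem under
test; -u is needed because bonnie++ refuses to run as root unless you
tell it to.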