Re: Random and frequent split brain

log1 and log2 are brick logs. The others are client logs.

On Thu, Jul 17, 2014 at 8:08 AM, Pranith Kumar Karampuri
<pkarampu@xxxxxxxxxx> wrote:
>
> On 07/17/2014 07:28 AM, Nilesh Govindrajan wrote:
>>
>> On Thu, Jul 17, 2014 at 7:26 AM, Nilesh Govindrajan <me@xxxxxxxxxxxx>
>> wrote:
>>>
>>> Hello,
>>>
>>> I'm having a weird issue. I have this config:
>>>
>>> node2 ~ # gluster peer status
>>> Number of Peers: 1
>>>
>>> Hostname: sto1
>>> Uuid: f7570524-811a-44ed-b2eb-d7acffadfaa5
>>> State: Peer in Cluster (Connected)
>>>
>>> node1 ~ # gluster peer status
>>> Number of Peers: 1
>>>
>>> Hostname: sto2
>>> Port: 24007
>>> Uuid: 3a69faa9-f622-4c35-ac5e-b14a6826f5d9
>>> State: Peer in Cluster (Connected)
>>>
>>> Volume Name: home
>>> Type: Replicate
>>> Volume ID: 54fef941-2e33-4acf-9e98-1f86ea4f35b7
>>> Status: Started
>>> Number of Bricks: 1 x 2 = 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: sto1:/data/gluster/home
>>> Brick2: sto2:/data/gluster/home
>>> Options Reconfigured:
>>> performance.write-behind-window-size: 2GB
>>> performance.flush-behind: on
>>> performance.cache-size: 2GB
>>> cluster.choose-local: on
>>> storage.linux-aio: on
>>> transport.keepalive: on
>>> performance.quick-read: on
>>> performance.io-cache: on
>>> performance.stat-prefetch: on
>>> performance.read-ahead: on
>>> cluster.data-self-heal-algorithm: diff
>>> nfs.disable: on
>>>
>>> sto1 and sto2 are aliases for node1 and node2 respectively.
>>>
>>> As you can see, NFS is disabled, so I'm using the native FUSE mount on
>>> both nodes. The volume contains files and PHP scripts that are served on
>>> various websites. When both nodes are active, many files go split brain,
>>> and the mount on node2 returns 'input/output error' for them, which
>>> causes HTTP 500 errors.
>>>
>>> I delete the affected files from the brick using find -samefile. That
>>> fixes things for a few minutes, and then the problem is back.
>>>
>>> What could be the issue? This happens even if I use the NFS mounting
>>> method.
>>>
>>> Gluster 3.4.4 on Gentoo.
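For reference, the split-brain state on a replicate volume like this can be inspected both from the CLI and from the bricks' extended attributes. A sketch, using the volume name and brick paths from the volume info above (the file path is a placeholder for one of the affected files):

```shell
# List files GlusterFS currently considers split-brained on this volume
gluster volume heal home info split-brain

# On each brick, dump the AFR changelog xattrs for an affected file.
# Non-zero trusted.afr.home-client-* counters on both bricks, each
# blaming the other, indicate a genuine split brain rather than a
# pending-but-healable change.
getfattr -d -m . -e hex /data/gluster/home/path/to/affected-file
```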
>>
>> And yes, network connectivity is not an issue: both nodes are in the
>> same DC, connected via a 1 Gbit line (shared by the internal and
>> external networks), and external traffic doesn't exceed 200-500 Mbit/s,
>> leaving plenty of headroom for Gluster. I also tried enabling quorum,
>> but that doesn't help either.
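For reference, the quorum settings in question would look like the following (sketched with the volume name from above). Note that client-side quorum is of limited help on a two-brick replica: with quorum-type auto, quorum requires the first brick to be up, so only one side stays writable.

```shell
# Client-side quorum: clients refuse writes when fewer than quorum
# bricks are reachable
gluster volume set home cluster.quorum-type auto

# Server-side quorum: glusterd kills local bricks when the peer
# quorum is lost
gluster volume set home cluster.server-quorum-type server
```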
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users@xxxxxxxxxxx
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
> hi Nilesh,
> Could you attach the mount and brick logs so that we can inspect what is
> going on in the setup?
>
> Pranith

Attachment: log1.xz
Description: application/xz

Attachment: log2
Description: Binary data

Attachment: node1client.log.gz
Description: GNU Zip compressed data

Attachment: node2client.log.gz
Description: GNU Zip compressed data

