We have a simple two-server, single-volume arrangement, replicating ~340k files (15 GB) between our web servers.
The servers are in AWS, in different availability zones. One of this weekend's operations is to add another pair of machines, one in each AZ.
I've deployed the same OS image with the Gluster server (3.6) and was under the impression I could add a brick to the existing replica simply by issuing:
gluster volume add-brick volume1 replica 3 pd-wfe3:/gluster-store
I would then presumably add the fourth server by repeating the above with "replica 4" and the fourth server's name.
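For clarity, here is the full sequence I had in mind as a sketch (the first command is the one I actually ran; the fourth server's name below is a placeholder, since that machine isn't built yet):

```shell
# Expand the replica from 2 to 3 bricks (the step already run)
gluster volume add-brick volume1 replica 3 pd-wfe3:/gluster-store

# Then expand from 3 to 4 with the fourth server
# (pd-wfe4 is a hypothetical name for the not-yet-built machine)
gluster volume add-brick volume1 replica 4 pd-wfe4:/gluster-store
```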
The operation appeared to succeed, and the new brick appears alongside the others:
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: pd-wfe1:/gluster-store
Brick2: pd-wfe2:/gluster-store
Brick3: pd-wfe3:/gluster-store
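If it would help the diagnosis, I can also gather output from the usual status and heal commands (a sketch, assuming the volume name volume1 as above):

```shell
# Overall brick/process status for the volume
gluster volume status volume1

# Self-heal progress -- a full heal of ~340k files onto the new
# brick might explain the CPU spike on pd-wfe1
gluster volume heal volume1 info
```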
But almost immediately pd-wfe1 crept up to 100% CPU in the gluster processes, and nginx began timing out while serving content from the volume.
On pd-wfe1, the glusterfs-glusterd-vol log is filled with this warning:
[2016-01-23 08:43:28.459215] W [socket.c:620:__socket_rwv] 0-management: readv on /var/run/c8bc2f99e7584cb9cf077c4f98d1db2e.socket failed (Invalid argument)
while the log named after the mount point shows this:
[2016-01-23 08:43:28.986379] W [client-rpc-fops.c:306:client3_3_mkdir_cbk] 2-volume1-client-2: remote operation failed: Permission denied. Path: (null)
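Given the "Permission denied" on mkdir against client-2 (which I take to be the new brick), I can also check ownership and the trusted xattrs on the brick directory on pd-wfe3 (a sketch; path as above):

```shell
# Ownership and permissions of the brick root on pd-wfe3
ls -ld /gluster-store

# Extended attributes Gluster sets on a brick root
getfattr -d -m . -e hex /gluster-store
```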
Does anyone have any suggestions on how to proceed? I would appreciate any input on this one.
Steve
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users