Re: Gluster management commands gives error

Atin Mukherjee <amukherj@xxxxxxxxxx> · Wed, 30 Mar 2016 15:22:28 +0530



On 03/30/2016 02:49 PM, Serkan Çoban wrote:
> I think the issue happens because none of the gluster v status v0
> [mem|clients|..] commands work on my cluster.
This is a known issue :(
With a large number of bricks (as in your case), the command incurs a
brick-op RPC for every brick. That is why it takes so long to finish,
and by the time it does, the CLI has already timed out.
> Those commands give Error request timeout and never output anything,
> maybe because of the brick count (1560) and client count (60), but somehow
> they continue in the background.
> After some time I can run normal gluster volume status|get|set
> commands, but when I again try to run gluster v status v0
> [mem|clients|..] it gives the timeout error.
> Gluster op version is 30707; this is a fresh 3.7.9 install.
> 
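Since the timed-out commands appear to continue in the background, one
precaution is to make sure that scripts on a given host never issue two
gluster commands at the same time. Below is a minimal sketch of such a
wrapper, not a tested fix. The lock-file path and the run_serialized name
are hypothetical, and the file lock only serializes callers that go through
this wrapper on one host. It does not wait for an operation that glusterd is
still processing after a CLI timeout, and it does nothing about commands
issued from other nodes.

```python
import fcntl
import os
import subprocess
import tempfile

# Hypothetical lock-file location (an assumption, not a gluster convention);
# any path writable by every caller on this host would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "gluster-cli.lock")


def run_serialized(cmd, lock_path=LOCK_PATH):
    """Run cmd while holding an exclusive file lock.

    Two callers using the same lock_path never run their commands at the
    same time: the second one blocks until the first has finished.
    """
    with open(lock_path, "w") as lock_file:
        # Blocks until no other wrapped command on this host holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            return subprocess.run(cmd, capture_output=True, text=True)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A script would then call, for example,
run_serialized(["gluster", "volume", "status", "v0"]) instead of invoking
the CLI directly, and inspect the returncode, stdout and stderr of the result.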
> On Tue, Mar 29, 2016 at 3:53 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
>>
>>
>> On 03/29/2016 05:34 PM, Serkan Çoban wrote:
>>> Hi, I am on 3.7.9 and currently none of the gluster commands (gluster
>>> peer status, gluster volume status) work. I have the lines below in the
>>> logs:
>>>
>>> [2016-03-29 11:25:23.878845] W
>>> [glusterd-locks.c:692:glusterd_mgmt_v3_unlock]
>>> (-->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c)
>>> [0x7fdbd9c53eec]
>>> -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162)
>>> [0x7fdbd9c5e432]
>>> -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x37a)
>>> [0x7fdbd9cff0ca] ) 0-management: Lock owner mismatch. Lock for vol v0
>>> held by c95f1a84-761e-493a-a462-54fcd6d72122
>>> [2016-03-29 11:27:39.310977] I [MSGID: 106499]
>>> [glusterd-handler.c:4329:__glusterd_handle_status_volume]
>>> 0-management: Received status volume req for volume v0
>>> [2016-03-29 11:27:39.312654] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on h1.domain.net. Please check log file for details.
>> Check whether you see any error message(s) in the glusterd log on
>> h1.domain.net. Are you by any chance running a script in the background
>> that triggers multiple commands on the same volume concurrently? If so,
>> this is expected.
>> Also, is your cluster op-version (gluster volume get <volname>
>> cluster.op-version) up to date?
>>
>>> [2016-03-29 11:27:39.312755] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on h2.domain.net. Please check log file for details.
>>> [2016-03-29 11:27:39.312781] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on h3.domain.net. Please check log file for details.
>>> [2016-03-29 11:27:39.312821] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on h4.domain.net. Please check log file for details.
>>> [2016-03-29 11:27:39.312849] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on h44.domain.net. Please check log file for details.
>>> [2016-03-29 11:27:39.312879] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on h35.domain.net. Please check log file for details.
>>> [2016-03-29 11:27:39.312920] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on h22.domain.net. Please check log file for details.
>>> [2016-03-29 11:27:39.312950] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on h11.domain.net. Please check log file for details.
>>> [2016-03-29 11:27:39.312981] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on h31.domain.net. Please check log file for details.
>>> [2016-03-29 11:27:39.313077] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on h33.domain.net. Please check log file for details.
>>> [2016-03-29 11:27:46.490894] E [MSGID: 106151]
>>> [glusterd-syncop.c:1868:gd_sync_task_begin] 0-management: Locking
>>> Peers Failed.
>>>
>>> What can I do to solve the problem? The commands sometimes give a timeout
>>> error and sometimes give locking failed on host...
>>>
>>> Serkan
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users@xxxxxxxxxxx
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>
> 
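To see at a glance which peers reported a failed lock, and therefore whose
glusterd logs to check first, the "Locking failed on <host>" messages quoted
above can be counted per host. This is only a rough sketch: the regular
expression is modelled on the messages pasted in this thread, and the
function name is made up for illustration.

```python
import re
from collections import Counter

# Modelled on the message quoted in this thread, e.g.
# "... 0-management: Locking failed on h1.domain.net. Please check log file for details."
LOCK_FAIL = re.compile(r"Locking failed on (\S+)\. Please check log file")


def hosts_with_lock_failures(lines):
    """Count 'Locking failed on <host>' messages per host."""
    counts = Counter()
    for line in lines:
        match = LOCK_FAIL.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[2016-03-29 11:27:39.312654] E [MSGID: 106116] "
        "[glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: "
        "Locking failed on h1.domain.net. Please check log file for details.",
        "[2016-03-29 11:27:39.312755] E [MSGID: 106116] "
        "[glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: "
        "Locking failed on h2.domain.net. Please check log file for details.",
    ]
    for host, count in hosts_with_lock_failures(sample).most_common():
        print(host, count)
```

In practice the lines would come from the glusterd log file on the node where
the command was run, for example by passing an open file object to the
function. Note that it expects each log message on a single line, as in the
log file itself, not wrapped as in the quoted mail.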