Hi Ashish,

There was heavy IO load on the cluster when it got locked down. I fear the processes waiting on IO will all have crashed.
Furthermore, both "start force" and "stop" returned "Error : Request timed out". I'm not sure whether that was caused by the semi-dead node. I'll hard-reset the node tomorrow and see if it helps.
Also, what caused the lock in the first place, and how can I avoid it? Any advice is appreciated.
Best wishes,
Chen

On 4/4/2016 6:11 PM, Ashish Pandey wrote:
Hi Chen,

As I suspected, there are many blocked inodelk calls in sm11/mnt-disk1-mainvol.31115.dump.1459760675.

=============================================
[xlator.features.locks.mainvol-locks.inode]
path=/home/analyzer/softs/bin/GenomeAnalysisTK.jar
mandatory=0
inodelk-count=4
lock-dump.domain.domain=mainvol-disperse-0:self-heal
lock-dump.domain.domain=mainvol-disperse-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 1, owner=dc2d3dfcc57f0000, client=0x7ff03435d5f0, connection-id=sm12-8063-2016/04/01-07:51:46:892384-mainvol-client-0-0-0, blocked at 2016-04-01 16:52:58, granted at 2016-04-01 16:52:58
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 1, owner=1414371e1a7f0000, client=0x7ff034204490, connection-id=hw10-17315-2016/04/01-07:51:44:421807-mainvol-client-0-0-0, blocked at 2016-04-01 16:58:51
inodelk.inodelk[2](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 1, owner=a8eb14cd9b7f0000, client=0x7ff01400dbd0, connection-id=sm14-879-2016/04/01-07:51:56:133106-mainvol-client-0-0-0, blocked at 2016-04-01 17:03:41
inodelk.inodelk[3](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 1, owner=b41a0482867f0000, client=0x7ff01800e670, connection-id=sm15-30906-2016/04/01-07:51:45:711474-mainvol-client-0-0-0, blocked at 2016-04-01 17:05:09
=============================================

This could be the cause of the hang.

Possible workaround - If there is no IO going on for this volume, we can restart the volume using "gluster v start <volume-name> force". This will restart the NFS process too, which will release the locks, and we could come out of this issue.

Ashish

----- Original Message -----
From: "Chen Chen" <chenchen@xxxxxxxxxxxxxxxx>
To: "Ashish Pandey" <aspandey@xxxxxxxxxx>
Cc: gluster-users@xxxxxxxxxxx
Sent: Monday, April 4, 2016 2:56:37 PM
Subject: Re: Need some help on Mismatching xdata / Failed combine iatt / Too many fd

Hi Ashish,

Yes, I only uploaded the directory of one node (sm11). All nodes are showing the same kind of errors at more or less the same time. I'm sending the info for the other 5 nodes. Logs of all bricks (except the "dead" 1x2) are also appended.

One of the nodes (sm16) refused to let me ssh into it. "volume status" said it is still alive, and showmount on it is working too. The node "hw10" works as a pure NFS server and doesn't have any bricks.

The dump files and logs are again in my Dropbox (3.8M):
https://dl.dropboxusercontent.com/u/56671522/statedump.tar.xz

Best wishes,
Chen

On 4/4/2016 4:27 PM, Ashish Pandey wrote:

Hi Chen,

By looking at the logs in mnt-disk1-mainvol.log and mnt-disk1-mainvol.log, I suspect this hang is because of inode lock contention. I think the logs provided are for one brick only. To make sure, we would need statedumps for all the brick processes and for NFS.

For bricks: gluster volume statedump <volname>
For the NFS server: gluster volume statedump <volname> nfs

The directory where statedump files are created can be found using the 'gluster --print-statedumpdir' command. If it is not present, create this directory.

Logs for all the bricks are also required.

You should try to restart the volume, which could resolve this hang if it is caused by an inode lock:

gluster volume start <volname> force

Ashish
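For reference, a minimal sketch of the diagnosis and recovery steps Ashish describes above, assuming the volume is named "mainvol" (as in the statedump) and that the dumps land in /var/run/gluster - use whatever directory 'gluster --print-statedumpdir' actually reports on your nodes:

    # Find where statedump files are written (create the directory if it is missing)
    gluster --print-statedumpdir

    # Dump state for all brick processes, then for the NFS server process
    gluster volume statedump mainvol
    gluster volume statedump mainvol nfs

    # Look for lock contention: an (ACTIVE) inodelk followed by (BLOCKED)
    # entries on the same inode matches the hang shown in the dump above
    grep -E 'inodelk.*(ACTIVE|BLOCKED)' /var/run/gluster/*.dump.*

    # If no IO is in flight on the volume, restart it to release the locks
    gluster volume start mainvol force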
--
Chen Chen
上海慧算生物技术有限公司
Shanghai SmartQuerier Biotechnology Co., Ltd.
Add: Room 410, 781 Cai Lun Road, China (Shanghai) Pilot Free Trade Zone
Shanghai 201203, P. R. China
Mob: +86 15221885893
Email: chenchen@xxxxxxxxxxxxxxxx
Web: www.smartquerier.com