Hi Ashish Pandey,

After some investigation I updated the server from 3.7.6 to 3.7.9. On April 1st I also switched from the native FUSE mount to an NFS mount (which boosted performance a lot when I tested).
Then, after two days of running, the cluster appeared to be locked: "ls" hangs, there is no network usage, and the volume profile shows no r/w activity on the bricks. "dmesg" showed that NFS went dead within 12 hrs (Apr 2 01:13), but "showmount" and "volume status" said the NFS server is responding and all bricks are alive.
I'm not sure what happened (glustershd.log and nfs.log didn't show anything interesting), so I dumped the whole log folder instead. It was a bit too large (5MB, mostly Error and Warning messages) and my mail was rejected multiple times by the mailing list, so I can only attach a snapshot of all the logs. You can grab the full version at https://dl.dropboxusercontent.com/u/56671522/glusterfs.tar.xz instead.
The volume profile info is also attached. Hope it helps.

Best wishes,
Chen

On 3/27/2016 2:38 AM, Ashish Pandey wrote:
Hi Chen,

Could you please send us the following logs:
1 - brick logs - under /var/log/messages/brick/
2 - mount logs

Also some information like what kind of IO was happening (read, write, unlink, rename on different mounts) to understand this issue in a better way.

---
Ashish

----- Original Message -----
From: "陈陈" <chenchen@xxxxxxxxxxxxxxxx>
To: gluster-users@xxxxxxxxxxx
Sent: Friday, March 25, 2016 8:59:04 AM
Subject: Need some help on Mismatching xdata / Failed combine iatt / Too many fd

Hi Everyone,

I have a "2 x (4 + 2) = 12 Distributed-Disperse" volume. After upgrading to 3.7.8 I noticed the volume is frequently out of service. The glustershd.log is flooded by:

[ec-combine.c:866:ec_combine_check] 0-mainvol-disperse-1: Mismatching xdata in answers of 'LOOKUP'
[ec-common.c:116:ec_check_status] 0-mainvol-disperse-1: Operation failed on some subvolumes (up=3F, mask=3F, remaining=0, good=1E, bad=21)
[ec-common.c:71:ec_heal_report] 0-mainvol-disperse-1: Heal failed [Invalid argument]
[ec-combine.c:206:ec_iatt_combine] 0-mainvol-disperse-0: Failed to combine iatt (inode: xxx, links: 1-1, uid: 1000-1000, gid: 1000-1000, rdev: 0-0, size: xxx-xxx, mode: 100600-100600)

in normal working state, and sometimes 1000+ lines of:

[client-rpc-fops.c:466:client3_3_open_cbk] 0-mainvol-client-7: remote operation failed. Path: <gfid:xxxx> (xxxx) [Too many open files]

and the brick went offline. "top open" showed "Max open fds: 899195".

Can anyone suggest what happened, and what I should do? I was trying to deal with the terrible IOPS problem but things got even worse.

Each server has 2 x E5-2630v3 (32 threads/server), 32GB RAM. Additional info is in the attachments.

Many thanks.
Sincerely yours,
Chen
--
Chen Chen
上海慧算生物技术有限公司
Shanghai SmartQuerier Biotechnology Co., Ltd.
Add: Room 410, 781 Cai Lun Road, China (Shanghai) Pilot Free Trade Zone
Shanghai 201203, P. R. China
Mob: +86 15221885893
Email: chenchen@xxxxxxxxxxxxxxxx
Web: www.smartquerier.com
Brick: sm16:/mnt/disk2/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        37968     18257117       657317
No. of Writes:        27442         4436          607

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:     19407384      3134980      3417081
No. of Writes:          641         1008         1217

   Block Size:      32768b+      65536b+
 No. of Reads:      1028960      9867913
No. of Writes:         6889     20508938

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            529   FORGET
      0.00       0.00 us       0.00 us       0.00 us          81095   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15748   RELEASEDIR

    Duration: 167705 seconds
   Data Read: 869764755456 bytes
Data Written: 1344596574720 bytes

Interval 1 Stats:
    Duration: 255 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm16:/mnt/disk1/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        25731       124811        62170
No. of Writes:        25591         5235          539

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:      1780063        41332      1599410
No. of Writes:          668          901         1155

   Block Size:      32768b+      65536b+
 No. of Reads:       597009      7867435
No. of Writes:         7347     18906027

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            500   FORGET
      0.00       0.00 us       0.00 us       0.00 us        2585213   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15757   RELEASEDIR

    Duration: 167705 seconds
   Data Read: 572226195968 bytes
Data Written: 1239575955968 bytes

Interval 1 Stats:
    Duration: 255 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm11:/mnt/disk1/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        38428     18330601       659152
No. of Writes:        27442         4436          607

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:     19680537      3186133      3557387
No. of Writes:          641         1008         1217

   Block Size:      32768b+      65536b+
 No. of Reads:       961274     10006889
No. of Writes:         6889     20508938

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            529   FORGET
      0.00       0.00 us       0.00 us       0.00 us          81097   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15742   RELEASEDIR

    Duration: 167705 seconds
   Data Read: 880603889664 bytes
Data Written: 1344596574720 bytes

Interval 1 Stats:
    Duration: 255 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm11:/mnt/disk2/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        26415       118603        62244
No. of Writes:        25591         5235          539

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:      1851055        41928      1466117
No. of Writes:          668          901         1155

   Block Size:      32768b+      65536b+
 No. of Reads:       641012      7944255
No. of Writes:         7347     18906027

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            500   FORGET
      0.00       0.00 us       0.00 us       0.00 us        2585238   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15755   RELEASEDIR

    Duration: 167705 seconds
   Data Read: 576850006016 bytes
Data Written: 1239575955968 bytes

Interval 1 Stats:
    Duration: 255 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm14:/mnt/disk2/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        37320     17789029       655061
No. of Writes:        27442         4436          607

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:     19600027      3110591      3185336
No. of Writes:          641         1008         1217

   Block Size:      32768b+      65536b+
 No. of Reads:      1043031      9626406
No. of Writes:         6889     20508938

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            529   FORGET
      0.00       0.00 us       0.00 us       0.00 us          81097   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15744   RELEASEDIR

    Duration: 167705 seconds
   Data Read: 850640217600 bytes
Data Written: 1344596574720 bytes

Interval 1 Stats:
    Duration: 255 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm14:/mnt/disk1/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        25430       118856        63730
No. of Writes:        25591         5235          539

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:      1896584        24957      1611272
No. of Writes:          668          901         1155

   Block Size:      32768b+      65536b+
 No. of Reads:       687858      7551537
No. of Writes:         7347     18906027

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            500   FORGET
      0.00       0.00 us       0.00 us       0.00 us        2585228   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15753   RELEASEDIR

    Duration: 167704 seconds
   Data Read: 554862222336 bytes
Data Written: 1239575955968 bytes

Interval 1 Stats:
    Duration: 255 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm13:/mnt/disk1/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        37852     18005136       657018
No. of Writes:        27442         4436          607

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:     19471162      3132895      3154404
No. of Writes:          641         1008         1217

   Block Size:      32768b+      65536b+
 No. of Reads:      1018312      9702965
No. of Writes:         6889     20508938

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            529   FORGET
      0.00       0.00 us       0.00 us       0.00 us          81097   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15741   RELEASEDIR

    Duration: 167705 seconds
   Data Read: 854245903872 bytes
Data Written: 1344596574720 bytes

Interval 1 Stats:
    Duration: 255 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm13:/mnt/disk2/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        25869       100519        63939
No. of Writes:        25591         5235          539

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:      1853310        41271      1394883
No. of Writes:          668          901         1155

   Block Size:      32768b+      65536b+
 No. of Reads:       576972      7517939
No. of Writes:         7347     18906027

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            500   FORGET
      0.00       0.00 us       0.00 us       0.00 us        2585248   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15754   RELEASEDIR

    Duration: 167705 seconds
   Data Read: 545438357504 bytes
Data Written: 1239575955968 bytes

Interval 1 Stats:
    Duration: 255 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm15:/mnt/disk1/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        25376       124010        62769
No. of Writes:        25591         5235          539

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:      1842626        25247      1747332
No. of Writes:          668          901         1155

   Block Size:      32768b+      65536b+
 No. of Reads:       615409      7695723
No. of Writes:         7347     18906027

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            500   FORGET
      0.00       0.00 us       0.00 us       0.00 us        2585252   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15752   RELEASEDIR

    Duration: 167705 seconds
   Data Read: 564089530880 bytes
Data Written: 1239575955968 bytes

Interval 1 Stats:
    Duration: 255 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm15:/mnt/disk2/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        37794     17969276       655026
No. of Writes:        27442         4436          607

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:     19297777      3087656      3290762
No. of Writes:          641         1008         1217

   Block Size:      32768b+      65536b+
 No. of Reads:      1025743      9707300
No. of Writes:         6889     20508938

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            529   FORGET
      0.00       0.00 us       0.00 us       0.00 us          81097   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15742   RELEASEDIR

    Duration: 167705 seconds
   Data Read: 855877165568 bytes
Data Written: 1344596574720 bytes

Interval 1 Stats:
    Duration: 255 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm12:/mnt/disk2/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        26499        99466        63056
No. of Writes:        25591         5235          539

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:      1870342        42157      1397655
No. of Writes:          668          901         1155

   Block Size:      32768b+      65536b+
 No. of Reads:       548533      7738956
No. of Writes:         7347     18906027

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            500   FORGET
      0.00       0.00 us       0.00 us       0.00 us        2585231   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          15751   RELEASEDIR

    Duration: 167706 seconds
   Data Read: 559290661888 bytes
Data Written: 1239575955968 bytes

Interval 1 Stats:
    Duration: 256 seconds
   Data Read: 0 bytes
Data Written: 0 bytes

Brick: sm12:/mnt/disk1/mainvol
------------------------------
Cumulative Stats:
   Block Size:        512b+       1024b+       2048b+
 No. of Reads:        38786     18260049       659154
No. of Writes:        27442         4436          607

   Block Size:       4096b+       8192b+      16384b+
 No. of Reads:     19442314      3161210      3400222
No. of Writes:          641         1008         1217

   Block Size:      32768b+      65536b+
 No. of Reads:       933426      9923716
No. of Writes:         6889     20508938

 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us            529   FORGET
      0.00       0.00 us       0.00 us       0.00 us          81097   RELEASE
      0.00       0.00 us       0.00 us       0.00 us          21405   RELEASEDIR
      0.75       2.66 us       2.00 us       3.00 us             35   OPENDIR
     14.07      49.86 us      23.00 us      92.00 us             35   LOOKUP
     33.15      58.74 us      20.00 us     116.00 us             70   READDIR
     52.04      92.23 us      20.00 us     921.00 us             70   GETXATTR

    Duration: 167706 seconds
   Data Read: 870389523968 bytes
Data Written: 1344596574720 bytes

Interval 1 Stats:
 %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
 ---------   -----------   -----------   -----------   ------------   ----
      0.00       0.00 us       0.00 us       0.00 us             32   RELEASEDIR
      0.76       2.69 us       2.00 us       3.00 us             32   OPENDIR
     13.85      49.22 us      23.00 us      92.00 us             32   LOOKUP
     32.81      58.28 us      20.00 us     116.00 us             64   READDIR
     52.59      93.42 us      20.00 us     921.00 us             64   GETXATTR

    Duration: 256 seconds
   Data Read: 0 bytes
Data Written: 0 bytes
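The claim that the bricks show no r/w activity can be checked mechanically against the profile output above. The following is only a rough sketch: it assumes the output keeps the "Brick:", "... Stats:" and "Data Read/Written: N bytes" labels shown here, and the function names summarize_profile and idle_bricks are made up for illustration. It scans with a regular expression, so it works on both the line-broken and the flattened form of the output.

```python
import re

# One pattern with three alternatives: brick header, section header, byte counter.
TOKEN = re.compile(
    r"Brick: (?P<brick>\S+)"
    r"|(?P<section>Cumulative|Interval \d+) Stats:"
    r"|Data (?P<kind>Read|Written): (?P<bytes>\d+) bytes"
)

def summarize_profile(text):
    """Return {brick: {"cumulative_read": int, "interval_written": int, ...}}."""
    stats = {}
    brick = section = None
    for m in TOKEN.finditer(text):
        if m.group("brick"):
            # A new brick starts; forget the section of the previous brick.
            brick = m.group("brick")
            section = None
            stats[brick] = {}
        elif m.group("section"):
            section = "cumulative" if m.group("section") == "Cumulative" else "interval"
        elif brick is not None and section is not None:
            key = f"{section}_{m.group('kind').lower()}"
            stats[brick][key] = int(m.group("bytes"))
    return stats

def idle_bricks(stats):
    """Bricks whose latest interval moved no data at all."""
    return sorted(
        b for b, s in stats.items()
        if s.get("interval_read") == 0 and s.get("interval_written") == 0
    )

# Sample taken from the first brick in the attached profile.
sample = (
    "Brick: sm16:/mnt/disk2/mainvol Cumulative Stats: "
    "Duration: 167705 seconds Data Read: 869764755456 bytes "
    "Data Written: 1344596574720 bytes Interval 1 Stats: "
    "Duration: 255 seconds Data Read: 0 bytes Data Written: 0 bytes"
)
stats = summarize_profile(sample)
print(stats)
print(idle_bricks(stats))
```

Feeding the whole attachment to summarize_profile should list every brick as idle in Interval 1, since all twelve report 0 bytes read and written there.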
[root@sm11 glusterfs]# tail bricks/*.log
==> bricks/mnt-disk1-mainvol.log <==
[2016-04-01 12:25:33.612779] E [MSGID: 115056] [server-rpc-fops.c:689:server_opendir_cbk] 0-mainvol-server: 10971356: OPENDIR /home/analyzer/personal/tcliu/projects/NTD/case_vcfs/HIGH/GALNT11 (e49e2adf-dc3f-41f5-96d5-b14b40f35d5f) ==> (Permission denied) [Permission denied]
[2016-04-01 12:29:46.857938] E [MSGID: 113018] [posix.c:234:posix_lookup] 0-mainvol-posix: post-operation lstat on parent /mnt/disk1/mainvol/.glusterfs/f3/83/f3833a3a-6c47-415d-ad1b-f3c6a7a57681 failed [No such file or directory]
[2016-04-01 12:29:46.859504] E [MSGID: 113018] [posix.c:234:posix_lookup] 0-mainvol-posix: post-operation lstat on parent /mnt/disk1/mainvol/.glusterfs/f3/83/f3833a3a-6c47-415d-ad1b-f3c6a7a57681 failed [No such file or directory]
[2016-04-01 12:33:45.228956] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk1/mainvol/.glusterfs/3b/50/3b50d2e8-4956-4f96-ac19-3053a04bb676 while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 14:31:38.579476] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk1/mainvol/.glusterfs/6b/b4/6bb472bd-df7a-47ce-8b9b-54f859e72b15 while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 15:20:17.888807] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk1/mainvol/.glusterfs/ab/9e/ab9e50a7-e233-4398-a459-3146d46554bf while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 15:22:29.297448] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk1/mainvol/.glusterfs/e7/d5/e7d5cb3c-6a47-45c9-8e90-d6863ba392ba while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 16:30:42.257752] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk1/mainvol/.glusterfs/4e/a4/4ea4bea2-b2fa-4a8c-9b28-f085edac24bb while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 16:30:42.257885] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk1/mainvol/.glusterfs/35/6b/356b1bd0-38e2-4a38-98aa-d070573b8ff2 while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 16:37:55.570342] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk1/mainvol/.glusterfs/42/69/426917ac-590b-4046-a6c8-f6b552d56c88 while doing xattrop: Key:trusted.ec.version [No such file or directory]

==> bricks/mnt-disk2-mainvol.log <==
[2016-04-01 08:42:09.301205] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk2/mainvol/.glusterfs/8e/dd/8edd9615-5efd-4ff6-a3bd-fd6847588b03 while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 08:42:46.330855] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk2/mainvol/.glusterfs/3c/7e/3c7e5913-9f70-468b-ae58-314048bc555a while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 11:48:46.297341] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk2/mainvol/.glusterfs/03/39/0339da12-5bae-42fb-a8e8-ced5e4526547 while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 12:07:58.239150] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk2/mainvol/.glusterfs/c3/36/c3363729-efc7-4336-8175-ad63aa6797bc while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 12:25:07.579365] E [MSGID: 115056] [server-rpc-fops.c:689:server_opendir_cbk] 0-mainvol-server: 3977749: OPENDIR <gfid:70c326b9-e98d-41fb-b1da-2f5e52440347>/FOLH1B (49a59e47-3f92-46d1-b745-6ae0a0e61db4) ==> (Permission denied) [Permission denied]
[2016-04-01 12:25:33.602114] E [MSGID: 115056] [server-rpc-fops.c:689:server_opendir_cbk] 0-mainvol-server: 3980178: OPENDIR <gfid:70c326b9-e98d-41fb-b1da-2f5e52440347>/FOLH1B (49a59e47-3f92-46d1-b745-6ae0a0e61db4) ==> (Permission denied) [Permission denied]
[2016-04-01 12:25:33.612870] E [MSGID: 115056] [server-rpc-fops.c:689:server_opendir_cbk] 0-mainvol-server: 3980186: OPENDIR <gfid:70c326b9-e98d-41fb-b1da-2f5e52440347>/GALNT11 (e49e2adf-dc3f-41f5-96d5-b14b40f35d5f) ==> (Permission denied) [Permission denied]
[2016-04-01 12:33:45.228275] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk2/mainvol/.glusterfs/3b/50/3b50d2e8-4956-4f96-ac19-3053a04bb676 while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 14:31:38.579305] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk2/mainvol/.glusterfs/3f/3d/3f3d9c37-be02-48e0-973b-ef8c4f3c295c while doing xattrop: Key:trusted.ec.version [No such file or directory]
[2016-04-01 14:52:40.571078] E [MSGID: 113001] [posix.c:5194:_posix_handle_xattr_keyvalue_pair] 0-mainvol-posix: getxattr failed on /mnt/disk2/mainvol/.glusterfs/b0/7b/b07b5553-a55d-4059-8777-a0ec40e51132 while doing xattrop: Key:trusted.ec.version [No such file or directory]

[root@sm11 glusterfs]# tail *.log
==> cli.log <==
[2016-04-03 07:33:00.323469] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-04-03 07:33:00.323577] I [socket.c:2356:socket_event_handler] 0-transport: disconnecting now
[2016-04-03 07:33:00.420750] I [cli-rpc-ops.c:2139:gf_cli_set_volume_cbk] 0-cli: Received resp to set
[2016-04-03 07:33:00.420987] I [input.c:36:cli_batch] 0-: Exiting with: 0
[2016-04-03 08:15:28.427738] I [cli.c:721:main] 0-cli: Started running gluster with version 3.7.9
[2016-04-03 08:15:28.436907] I [cli-cmd-volume.c:1795:cli_check_gsync_present] 0-: geo-replication not installed
[2016-04-03 08:15:28.437338] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-04-03 08:15:28.437433] I [socket.c:2356:socket_event_handler] 0-transport: disconnecting now
[2016-04-03 08:15:28.551696] I [cli-rpc-ops.c:2139:gf_cli_set_volume_cbk] 0-cli: Received resp to set
[2016-04-03 08:15:28.551936] I [input.c:36:cli_batch] 0-: Exiting with: 0

==> cmd_history.log <==
[2016-04-01 08:00:00.856908] : volume set help : SUCCESS
[2016-04-01 08:03:45.605250] : volume set help : SUCCESS
[2016-04-03 06:26:59.505978] : volume set help : SUCCESS
[2016-04-03 06:41:39.827425] : volume set help : SUCCESS
[2016-04-03 06:41:53.469277] : volume set help : SUCCESS
[2016-04-03 06:42:13.859466] : volume set help : SUCCESS
[2016-04-03 07:06:58.119033] : volume set help : SUCCESS
[2016-04-03 07:07:08.245910] : volume set help : SUCCESS
[2016-04-03 07:33:00.420496] : volume set help : SUCCESS
[2016-04-03 08:15:28.551440] : volume set help : SUCCESS

==> data.log <==
[2016-03-30 06:17:21.185246] W [MSGID: 114060] [client-handshake.c:724:client3_3_reopen_cbk] 0-mainvol-client-9: reopen on <gfid:69707d8f-989a-4cba-b724-33db1e8b8bbe> failed. [Stale file handle]
[2016-03-30 06:17:21.185988] W [MSGID: 114060] [client-handshake.c:724:client3_3_reopen_cbk] 0-mainvol-client-9: reopen on <gfid:9f4bafa4-b932-410a-877c-265edb553155> failed. [Stale file handle]
[2016-03-30 06:17:21.186031] W [MSGID: 114060] [client-handshake.c:724:client3_3_reopen_cbk] 0-mainvol-client-9: reopen on <gfid:9f4bafa4-b932-410a-877c-265edb553155> failed. [Stale file handle]
[2016-03-30 06:17:47.088748] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-mainvol-disperse-1: Operation failed on some subvolumes (up=3F, mask=37, remaining=0, good=37, bad=8)
The message "W [MSGID: 122035] [ec-common.c:419:ec_child_select] 0-mainvol-disperse-1: Executing operation with some subvolumes unavailable (8)" repeated 5 times between [2016-03-30 06:16:14.119064] and [2016-03-30 06:17:14.120278]
[2016-03-30 06:17:21.155916] W [MSGID: 114060] [client-handshake.c:724:client3_3_reopen_cbk] 0-mainvol-client-9: reopen on <gfid:a293e6b6-357f-4cce-934e-f21757615648> failed. [Stale file handle]
[2016-03-30 06:17:21.156052] W [MSGID: 114060] [client-handshake.c:724:client3_3_reopen_cbk] 0-mainvol-client-9: reopen on <gfid:fb838b06-bd89-4cd1-931d-49f16185e742> failed. [Stale file handle]
The message "W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-mainvol-disperse-1: Operation failed on some subvolumes (up=3F, mask=37, remaining=0, good=37, bad=8)" repeated 3 times between [2016-03-30 06:17:47.088748] and [2016-03-30 06:18:07.271744]
[2016-03-30 06:18:14.124001] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-mainvol-disperse-1: Operation failed on some subvolumes (up=3F, mask=37, remaining=0, good=37, bad=8)
[2016-04-01 03:14:01.680654] W [glusterfsd.c:1251:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f425f2e8dc5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f42609538b5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x69) [0x7f4260953739] ) 0-: received signum (15), shutting down

==> etc-glusterfs-glusterd.vol.log <==
[2016-04-03 06:26:53.908779] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2016-04-03 06:26:59.507550] I [socket.c:3383:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2016-04-03 06:26:59.507588] E [rpcsvc.c:1314:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 12) to rpc-transport (socket.management)
[2016-04-03 06:26:59.507613] E [MSGID: 106430] [glusterd-utils.c:474:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2016-04-03 06:42:13.859506] I [socket.c:3383:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2016-04-03 06:42:13.859520] E [rpcsvc.c:1314:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 12) to rpc-transport (socket.management)
[2016-04-03 06:42:13.859534] E [MSGID: 106430] [glusterd-utils.c:474:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2016-04-03 07:07:08.245951] I [socket.c:3383:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2016-04-03 07:07:08.245966] E [rpcsvc.c:1314:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 12) to rpc-transport (socket.management)
[2016-04-03 07:07:08.245981] E [MSGID: 106430] [glusterd-utils.c:474:glusterd_submit_reply] 0-glusterd: Reply submission failed

==> glfsheal-mainvol.log <==

==> glustershd.log <==
[2016-04-02 17:03:08.694924] E [MSGID: 114031] [client-rpc-fops.c:466:client3_3_open_cbk] 0-mainvol-client-5: remote operation failed. Path: <gfid:c5df439e-1c6e-4105-b6c2-014a7be439cd> (c5df439e-1c6e-4105-b6c2-014a7be439cd) [Transport endpoint is not connected]
[2016-04-02 17:03:08.695053] E [MSGID: 114031] [client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-mainvol-client-5: remote operation failed [Transport endpoint is not connected]
[2016-04-02 17:03:08.703770] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f2bc87bca52] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f2bc85878de] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f2bc85879ee] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f2bc858937a] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f2bc8589ba8] ))))) 0-mainvol-client-11: forced unwinding frame type(GlusterFS 3.3) op(INODELK(29)) called at 2016-04-02 17:03:08.697566 (xid=0x4fb1e9)
[2016-04-02 17:03:08.638750] W [MSGID: 122056] [ec-combine.c:866:ec_combine_check] 0-mainvol-disperse-1: Mismatching xdata in answers of 'LOOKUP'
[2016-04-02 17:03:08.700878] E [MSGID: 114031] [client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-mainvol-client-5: remote operation failed [Transport endpoint is not connected]
The message "E [MSGID: 114031] [client-rpc-fops.c:1624:client3_3_inodelk_cbk] 0-mainvol-client-11: remote operation failed [Transport endpoint is not connected]" repeated 7 times between [2016-04-02 17:03:08.628255] and [2016-04-02 17:03:08.748082]
[2016-04-02 17:33:09.795869] E [rpc-clnt.c:201:call_bail] 0-mainvol-client-11: bailing out frame type(GlusterFS 3.3) op(OPEN(11)) xid = 0x4fb1f6 sent = 2016-04-02 17:03:08.750519. timeout = 1800 for 172.16.135.16:49153
[2016-04-02 17:33:09.795952] E [MSGID: 114031] [client-rpc-fops.c:466:client3_3_open_cbk] 0-mainvol-client-11: remote operation failed. Path: <gfid:db0c1d6c-f733-4bc3-8c76-0b1cc8d6cbe7> (db0c1d6c-f733-4bc3-8c76-0b1cc8d6cbe7) [Transport endpoint is not connected]
[2016-04-02 18:01:59.992552] E [rpc-clnt.c:201:call_bail] 0-mainvol-client-11: bailing out frame type(GlusterFS 3.3) op(OPEN(11)) xid = 0x4fb221 sent = 2016-04-02 17:31:56.361972. timeout = 1800 for 172.16.135.16:49153
[2016-04-02 18:01:59.992618] E [MSGID: 114031] [client-rpc-fops.c:466:client3_3_open_cbk] 0-mainvol-client-11: remote operation failed. Path: <gfid:9afe87ba-855f-492c-901f-b618f5247705> (9afe87ba-855f-492c-901f-b618f5247705) [Transport endpoint is not connected]

==> mainvol-rebalance.log <==
230: volume mainvol
231: type debug/io-stats
232: option log-level WARNING
233: option latency-measurement off
234: option count-fop-hits off
235: subvolumes mainvol-dht
236: end-volume
237:
+------------------------------------------------------------------------------+
[2016-04-01 05:03:14.643881] W [glusterfsd.c:1251:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f9c219c9dc5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f9c230348b5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x69) [0x7f9c23034739] ) 0-: received signum (15), shutting down

==> nfs.log <==
[2016-04-01 12:54:42.426597] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-mainvol-disperse-1: Operation failed on some subvolumes (up=3F, mask=3F, remaining=10, good=2D, bad=2)
[2016-04-01 12:59:47.138952] E [MSGID: 114030] [client-rpc-fops.c:3022:client3_3_readv_cbk] 0-mainvol-client-4: XDR decoding failed [Invalid argument]
[2016-04-01 12:59:47.139022] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-mainvol-client-4: remote operation failed [Invalid argument]
[2016-04-01 12:59:47.141444] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-mainvol-disperse-0: Operation failed on some subvolumes (up=3F, mask=3F, remaining=4, good=2B, bad=10)
[2016-04-01 13:38:07.961739] E [MSGID: 114030] [client-rpc-fops.c:3022:client3_3_readv_cbk] 0-mainvol-client-7: XDR decoding failed [Invalid argument]
[2016-04-01 13:38:07.962014] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-mainvol-client-7: remote operation failed [Invalid argument]
[2016-04-01 13:38:07.964187] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-mainvol-disperse-1: Operation failed on some subvolumes (up=3F, mask=3F, remaining=10, good=2D, bad=2)
[2016-04-01 15:16:17.152097] E [MSGID: 114030] [client-rpc-fops.c:3022:client3_3_readv_cbk] 0-mainvol-client-1: XDR decoding failed [Invalid argument]
[2016-04-01 15:16:17.159452] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-mainvol-client-1: remote operation failed [Invalid argument]
[2016-04-01 15:16:17.159833] W [MSGID: 122053] [ec-common.c:116:ec_check_status] 0-mainvol-disperse-0: Operation failed on some subvolumes (up=3F, mask=3F, remaining=1, good=3C, bad=2)
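The "up=3F, mask=3F, ..., good=2D, bad=2" fields in the ec_check_status lines are easier to read once expanded. The sketch below rests on an assumption that is not stated anywhere in this thread: that each value is a hexadecimal bitmap in which bit i stands for the i-th brick of the disperse subvolume. The width of 6 comes from the 4 + 2 layout described earlier, and decode_masks is a made-up helper name.

```python
import re

# Fields as they appear in the ec_check_status messages quoted above.
FIELD = re.compile(r"\b(up|mask|remaining|good|bad)=([0-9A-Fa-f]+)")

def decode_masks(line, width=6):
    """Map each field to the brick indices whose bit is set.

    Assumption: values are hex bitmaps, bit i = i-th brick of the
    disperse subvolume; width=6 matches a 4 + 2 layout.
    """
    decoded = {}
    for name, hexval in FIELD.findall(line):
        value = int(hexval, 16)
        decoded[name] = [i for i in range(width) if (value >> i) & 1]
    return decoded

# Message taken from the original report in this thread.
line = ("Operation failed on some subvolumes "
        "(up=3F, mask=3F, remaining=0, good=1E, bad=21)")
masks = decode_masks(line)
print(masks)
```

Under that assumption, the line from the original report reads as: all six bricks up, bricks 1 to 4 agreeing, and bricks 0 and 5 giving a different answer. The nfs.log entries with "good=2D, bad=2" would then point at brick 1 only.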
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users