On 01/25/2016 11:10 PM, David Robinson wrote:
David,

I see a lot of traffic from [f]inodelks:

15:09:00 :) ⚡ grep wind_from data-brick02a-homegfs.4066.dump.1453742225 | sort | uniq -c
     11 unwind_from=default_finodelk_cbk
     11 unwind_from=io_stats_finodelk_cbk
     11 unwind_from=pl_common_inodelk
   1133 wind_from=default_finodelk_resume
      1 wind_from=default_inodelk_resume
     75 wind_from=index_getxattr
      6 wind_from=io_stats_entrylk
  12776 wind_from=io_stats_finodelk
     15 wind_from=io_stats_flush
     75 wind_from=io_stats_getxattr
      4 wind_from=io_stats_inodelk
      4 wind_from=io_stats_lk
      4 wind_from=io_stats_setattr
     75 wind_from=marker_getxattr
      4 wind_from=marker_setattr
     75 wind_from=quota_getxattr
      6 wind_from=server_entrylk_resume
  12776 wind_from=server_finodelk_resume   <<--------------
     15 wind_from=server_flush_resume
     75 wind_from=server_getxattr_resume
      4 wind_from=server_inodelk_resume
      4 wind_from=server_lk_resume
      4 wind_from=server_setattr_resume

But only a very small number of active locks:

pk1@localhost - ~/Downloads
15:09:07 :) ⚡ grep ACTIVE data-brick02a-homegfs.4066.dump.1453742225
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 11678, owner=b42fff03ce7f0000, client=0x13d2cd0, connection-id=corvidpost3.corvidtec.com-52656-2016/01/22-16:40:31:459920-homegfs-client-6-0-1, granted at 2016-01-25 17:16:06
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 15759, owner=b8ca8c0100000000, client=0x189e470, connection-id=corvidpost4.corvidtec.com-17718-2016/01/22-16:40:31:221380-homegfs-client-6-0-1, granted at 2016-01-25 17:12:52
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 7103, owner=0cf31a98f87f0000, client=0x2201d60, connection-id=zlv-bangell-4812-2016/01/25-13:45:52:170157-homegfs-client-6-0-0, granted at 2016-01-25 17:09:56
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 55764, owner=882dbea1417f0000, client=0x17fc940, connection-id=corvidpost.corvidtec.com-35961-2016/01/22-16:40:31:88946-homegfs-client-6-0-1, granted at 2016-01-25 17:06:12
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 21129, owner=3cc068a1e07f0000, client=0x1495040, connection-id=corvidpost2.corvidtec.com-43400-2016/01/22-16:40:31:248771-homegfs-client-6-0-1, granted at 2016-01-25 17:15:53

One more odd thing I found is the following:

[2016-01-15 14:03:06.910687] C [rpc-clnt-ping.c:109:rpc_clnt_ping_timer_expired] 0-homegfs-client-2: server 10.200.70.1:49153 has not responded in the last 10 seconds, disconnecting.
[2016-01-15 14:03:06.910886] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x2b74c289a580] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x2b74c2b27787] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x2b74c2b2789e] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x2b74c2b27951] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x2b74c2b27f1f] ))))) 0-homegfs-client-2: forced unwinding frame type(GlusterFS 3.3) op(FINODELK(30)) called at 2016-01-15 10:30:09.487422 (xid=0x11ed3f)

FINODELK was called at 2016-01-15 10:30:09.487422, but the response still had not come by 14:03:06. That is almost 3.5 hours! Something really bad related to locks is happening. Did you apply the fix for the recent memory corruption bug, which only affects workloads with more than 128 clients? http://review.gluster.org/13241

Pranith
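
A rough way to quantify the finodelk backlog in that statedump, and to double-check the elapsed time above, is sketched below in plain shell. This is only a sketch under assumptions: that the statedump also records queued lock requests with a BLOCKED marker, and that "wound minus unwound" frame counts approximate what is still pending inside the brick; neither is confirmed in this thread.

DUMP=data-brick02a-homegfs.4066.dump.1453742225

# Queued-but-not-granted inodelk requests (assumes the dump marks them BLOCKED):
grep -c BLOCKED "$DUMP"

# Rough estimate of finodelk frames wound into the server xlator but never
# unwound, i.e. apparently still stuck inside the brick:
wound=$(grep -c 'wind_from=server_finodelk_resume' "$DUMP")
unwound=$(grep -c 'unwind_from=default_finodelk_cbk' "$DUMP")
echo "$((wound - unwound)) finodelk frames apparently still pending"

# Sanity check on the gap between the FINODELK call and the forced unwind
# seen in the client log (GNU date):
start=$(date -u -d '2016-01-15 10:30:09' +%s)
end=$(date -u -d '2016-01-15 14:03:06' +%s)
echo "$(( (end - start) / 60 )) minutes elapsed"   # ~212 minutes, i.e. almost 3.5 hours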
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel