I left it there for further troubleshooting; however, I guess one of my colleagues restarted the cosd, so I no longer have the stack trace. Let me see if I can reproduce the issue again.

Regards,
Leander Yu.

On Tue, Oct 5, 2010 at 11:44 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Tue, 5 Oct 2010, Leander Yu wrote:
>> I used gdb to attach to the cosd process. Most of the threads are in pthread_mutex_lock; however, there are two threads stuck waiting for pg->lock().
>
> Yeah.  Either a thread holding the lock is deadlocked somewhere, or the lock was leaked.  Do you still have the stack trace from the other threads?
>
> sage
>
>>
>> ------------ Thread 184 -----------------
>> #0  0x00007f7b801ecc44 in __lll_lock_wait () from /lib64/libpthread.so.0
>> #1  0x00007f7b801e7f15 in _L_lock_1056 () from /lib64/libpthread.so.0
>> #2  0x00007f7b801e7de7 in pthread_mutex_lock () from /lib64/libpthread.so.0
>> #3  0x000000000047509a in ?? ()
>> #4  0x00000000004b4181 in C_OSD_Commit::finish(int) ()
>> #5  0x00000000005b57d8 in Finisher::finisher_thread_entry() ()
>> #6  0x000000000046c61a in Thread::_entry_func(void*) ()
>> #7  0x00007f7b801e5a3a in start_thread () from /lib64/libpthread.so.0
>> #8  0x00007f7b7f40377d in clone () from /lib64/libc.so.6
>> #9  0x0000000000000000 in ?? ()
>>
>> -------------- Thread 182 -------------------
>> #0  0x00007f7b801ecc44 in __lll_lock_wait () from /lib64/libpthread.so.0
>> #1  0x00007f7b801e7f15 in _L_lock_1056 () from /lib64/libpthread.so.0
>> #2  0x00007f7b801e7de7 in pthread_mutex_lock () from /lib64/libpthread.so.0
>> #3  0x00000000004eb0fa in Mutex::Lock(bool) ()
>> #4  0x00000000004bac90 in OSD::_lookup_lock_pg(pg_t) ()
>> #5  0x00000000004e6e7f in OSD::handle_sub_op_reply(MOSDSubOpReply*) ()
>> #6  0x00000000004e871d in OSD::_dispatch(Message*) ()
>> #7  0x00000000004e8fa9 in OSD::ms_dispatch(Message*) ()
>> #8  0x000000000045eab9 in SimpleMessenger::dispatch_entry() ()
>> #9  0x00000000004589fc in SimpleMessenger::DispatchThread::entry() ()
>> #10 0x000000000046c61a in Thread::_entry_func(void*) ()
>> #11 0x00007f7b801e5a3a in start_thread () from /lib64/libpthread.so.0
>> #12 0x00007f7b7f40377d in clone () from /lib64/libc.so.6
>> #13 0x0000000000000000 in ?? ()
>>
>> It seems the dispatch thread was blocked, so no messages can be handled?
>>
>> Regards,
>> Leander Yu.
>>
>> On Tue, Oct 5, 2010 at 2:04 PM, Leander Yu <leander.yu@xxxxxxxxx> wrote:
>> > It seems like the cosd (asgc-osd9) is idle now. It didn't send any heartbeat out, but the process is still running.
>> > This is the netstat output
>> > ---------------------------------------------
>> > Proto Recv-Q Send-Q Local Address           Foreign Address         State
>> > tcp        0      0 0.0.0.0:8649            0.0.0.0:*               LISTEN
>> > tcp        0      0 0.0.0.0:8139            0.0.0.0:*               LISTEN
>> > tcp        0      0 0.0.0.0:587             0.0.0.0:*               LISTEN
>> > tcp        0      0 0.0.0.0:8651            0.0.0.0:*               LISTEN
>> > tcp        0      0 0.0.0.0:8652            0.0.0.0:*               LISTEN
>> > tcp        0      0 192.168.1.9:50000       0.0.0.0:*               LISTEN
>> > tcp        0      0 0.0.0.0:6800            0.0.0.0:*               LISTEN
>> > tcp        0      0 192.168.1.9:50001       0.0.0.0:*               LISTEN
>> > tcp        0      0 0.0.0.0:465             0.0.0.0:*               LISTEN
>> > tcp        0      0 0.0.0.0:6801            0.0.0.0:*               LISTEN
>> > tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
>> > tcp        0      0 192.168.1.9:50008       0.0.0.0:*               LISTEN
>> > tcp        0      0 0.0.0.0:25              0.0.0.0:*               LISTEN
>> > tcp        0      0 192.168.1.9:50011       0.0.0.0:*               LISTEN
>> > tcp        0      0 127.0.0.1:8649          127.0.0.1:38054         TIME_WAIT
>> > tcp        0      0 127.0.0.1:8649          127.0.0.1:38055         TIME_WAIT
>> > tcp        0      0 192.168.1.9:22          192.168.1.101:35785     ESTABLISHED
>> > tcp        0      0 127.0.0.1:8649          127.0.0.1:38053         TIME_WAIT
>> > tcp        0      0 192.168.1.9:22          192.168.1.101:56473     ESTABLISHED
>> > tcp        0      0 127.0.0.1:8649          127.0.0.1:38052         TIME_WAIT
>> > tcp        0      0 :::587                  :::*                    LISTEN
>> > tcp        0      0 :::465                  :::*                    LISTEN
>> > tcp        0      0 :::22                   :::*                    LISTEN
>> > tcp        0      0 :::25                   :::*                    LISTEN
>> >
>> > Regards,
>> > Leander Yu.
>> >
>> > On Tue, Oct 5, 2010 at 1:40 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> >> On Tue, 5 Oct 2010, Leander Yu wrote:
>> >>> I have another OSD that was marked down; however, I can still access the machine by ssh, and I see the cosd process is still running.
>> >>> The log shows the same pipe fault error:
>> >>>
>> >>> 192.168.1.9:6801/1537 >> 192.168.1.25:6801/29084 pipe(0x7f7b680e2620 sd=-1 pgs=437 cs=1 l=0).fault with nothing to send, going to standby
>> >>
>> >> That error means there was a socket error (usually a dropped connection, but it could be lots of things), but the connection wasn't in use.
>> >>
>> >> This one looks like the heartbeat channel.  Most likely that connection reconnected shortly after that (the osds send heartbeats every couple of seconds).  They're marked down when peer osds expect a heartbeat and don't get one.  The monitor log ($mon_data/log) normally has information about who reported the failure, but it looks like you've turned it off.
>> >>
>> >> In any case, the error is usually harmless, and probably unrelated to the MDS error (unless perhaps the same network glitch was to blame).
>> >>
>> >> sage
>> >>
>> >>> Are those two cases related?
>> >>>
>> >>> Regards,
>> >>> Leander Yu.
>> >>>
>> >>> On Tue, Oct 5, 2010 at 1:15 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> >>> > On Tue, 5 Oct 2010, Leander Yu wrote:
>> >>> >> Hi Sage,
>> >>> >> Thanks a lot for your prompt answer.
>> >>> >> So is this behavior normal, if we assume there was a network issue?
>> >>> >> In this case, would it be better to restart the mds instead of having it commit suicide, or to leave it there as a standby?
>> >>> >
>> >>> > The mds has lots of internal state that would be tricky to clean up properly, so one way or another the old instance should die.
>> >>> >
>> >>> > But you're right: probably it should just respawn a new instance instead of exiting?  The new instance will come back up in standby mode.  Maybe re-exec with the same set of arguments the original instance was executed with?
>> >>> >
>> >>> > sage
>> >>> >
>> >>> >> Regards,
>> >>> >> Leander Yu.
>> >>> >>
>> >>> >> On Tue, Oct 5, 2010 at 1:00 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> >>> >> > On Mon, 4 Oct 2010, Sage Weil wrote:
>> >>> >> >> On Tue, 5 Oct 2010, Leander Yu wrote:
>> >>> >> >> > Hi,
>> >>> >> >> > I have a 46-machine cluster (44 osd/mon + 2 mds) running ceph now. The MDS is running in active/standby mode.
>> >>> >> >> > This morning one of the MDSs committed suicide; the log shows:
>> >>> >> >> >
>> >>> >> >> > -------------------------------------------
>> >>> >> >> > 2010-10-04 22:24:19.450022 7f2e5a1ee710 mds0.cache.ino(10000002b87) pop_projected_snaprealm 0x7f2e50cd9f70 seq1
>> >>> >> >> > 2010-10-04 22:26:12.180854 7f2debbfb710 -- 192.168.1.103:6800/2081 >> 192.168.1.106:0/2453428678 pipe(0x7f2e380013d0 sd=-1 pgs=2 cs=1 l=0).fault with nothing to send, going to standby
>> >>> >> >> > 2010-10-04 22:26:12.181019 7f2e481dc710 -- 192.168.1.103:6800/2081 >> 192.168.1.111:0/18905730 pipe(0x7f2e38002250 sd=-1 pgs=2 cs=1 l=0).fault with nothing to send, going to standby
>> >>> >> >> > 2010-10-04 22:26:12.181041 7f2dc3fff710 -- 192.168.1.103:6800/2081 >> 192.168.1.114:0/1945631186 pipe(0x7f2e38000f00 sd=-1 pgs=2 cs=1 l=0).fault with nothing to send, going to standby
>> >>> >> >> > 2010-10-04 22:26:12.181149 7f2deaef6710 -- 192.168.1.103:6800/2081 >> 192.168.1.113:0/521184914 pipe(0x7f2e38002f90 sd=-1 pgs=2 cs=1 l=0).fault with nothing to send, going to standby
>> >>> >> >> > 2010-10-04 22:26:12.181563 7f2deb5f5710 -- 192.168.1.103:6800/2081 >> 192.168.1.112:0/4272114728 pipe(0x7f2e38002ac0 sd=-1 pgs=2 cs=1 l=0).fault with nothing to send, going to standby
>> >>> >> >> > 2010-10-04 22:26:13.777624 7f2e5a1ee710 mds-1.3 handle_mds_map i (192.168.1.103:6800/2081) dne in the mdsmap, killing myself
>> >>> >> >> > 2010-10-04 22:26:13.777649 7f2e5a1ee710 mds-1.3 suicide. wanted up:active, now down:dne
>> >>> >> >> > 2010-10-04 22:26:13.777769 7f2e489e4710 -- 192.168.1.103:6800/2081 >> 192.168.1.101:0/15702 pipe(0x7f2e380008c0 sd=-1 pgs=1847 cs=1 l=0).fault with nothing to send, going to standby
>> >>> >> >> > ------------------------------------------------------------------------------
>> >>> >> >> > Would you suggest how I should troubleshoot this issue? Or should I just restart the mds to recover it?
>> >>> >> >>
>> >>> >> >> The MDS killed itself because it was removed from the mdsmap.  The monitor log will tell you why if you had logging turned up.  If not, you might find some clue by looking at each mdsmap iteration.  If you do
>> >>> >> >>
>> >>> >> >> $ ceph mds stat
>> >>> >> >>
>> >>> >> >> it will tell you the map epoch (e###).  You can then dump any map iteration with
>> >>> >> >>
>> >>> >> >> $ ceph mds dump 123 -o -
>> >>> >> >>
>> >>> >> >> Work backward a few iterations until you find which epoch removed that mds instance.  The one prior to that might have some clue (maybe it was laggy?)...
>> >>> >> >
>> >>> >> > Okay, looking at the maps on your cluster, it looks like there was a standby mds, and the live one was marked down.
>> >>> >> > Probably some intermittent network issue prevented it from sending the monitor beacon on time, and the monitor decided it was dead/unresponsive.  The standby cmds took over successfully.  The recovery looks like it took about 20 seconds.
>> >>> >> >
>> >>> >> > sage
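
If the cosd hang reproduces, the most useful thing to capture is a backtrace of every thread, not just the two blocked on pg->lock(); the thread actually holding the lock will usually be parked in some other frame. A minimal, non-interactive sketch of grabbing that with gdb, assuming debug symbols are installed; "$(pidof cosd)" assumes a single cosd process on the node, and the output file name is arbitrary:

  # attach to the running cosd, dump backtraces for all threads, then detach
  gdb -p "$(pidof cosd)" --batch -ex 'set pagination off' -ex 'thread apply all bt' > cosd-threads.txt 2>&1

  # the lock holder is typically the thread NOT sitting in pthread_mutex_lock /
  # __lll_lock_wait; OSD:: and PG:: frames are good candidates to inspect first
  grep -n 'OSD::\|PG::' cosd-threads.txt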
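For the "who reported the failure" question on the osd side, Sage notes the monitor log ($mon_data/log) normally records the failure reports. If that logging is re-enabled, a quick way to pull out the relevant entries; the mon data path and the osd name below are placeholders, not the actual values on this cluster:

  # adjust /data/mon0 to your 'mon data' directory, and osd9 to the osd in question
  grep -i 'fail' /data/mon0/log | grep 'osd9'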
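For the mdsmap archaeology Sage describes (working backward until you find the epoch that removed the mds), a small sketch of a loop that saves each map to its own file so consecutive epochs can be diffed. The epoch range 211-230 and the output directory are only examples; start from whatever epoch 'ceph mds stat' reports:

  # dump a range of mdsmap epochs into separate files
  mkdir -p mdsmaps
  for e in $(seq 211 230); do
      ceph mds dump $e -o mdsmaps/mdsmap.$e
  done

  # then compare consecutive epochs to spot where the mds disappeared, e.g.
  diff mdsmaps/mdsmap.221 mdsmaps/mdsmap.222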