I tried to trace the shutdown call triggered by SIGTERM in the OSD. It seems
the shutdown call never reaches BlueStore::umount.

Steps:

1. Start the OSD.

2. Attach gdb and set breakpoints:

(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00007f5f8ba5e8a0 in OSD::shutdown() at osd/OSD.cc:2599
2       breakpoint     keep y   0x00007f5f8bd65160 in BlueStore::umount() at os/bluestore/BlueStore.cc:2686

3. Trigger stop.sh.

Breakpoint 1 is hit, but the second breakpoint is never hit.

It gets stuck somewhere in this call:

#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007f5f8ba515b5 in WaitUntil (when=..., mutex=..., this=0x7f5f95428e20) at ./common/Cond.h:72
#2  OSDService::prepare_to_stop (this=this@entry=0x7f5f954275c8) at osd/OSD.cc:1174
#3  0x00007f5f8ba5e8cb in OSD::shutdown (this=this@entry=0x7f5f95426000) at osd/OSD.cc:2600
#4  0x00007f5f8ba604d0 in OSD::handle_signal (this=0x7f5f95426000, signum=<optimized out>) at osd/OSD.cc:1739
#5  0x00007f5f8c0209b7 in SignalHandler::entry (this=0x7f5f952a8560) at global/signal_handler.cc:252
#6  0x00007f5f89f19182 in start_thread (arg=0x7f5f686d7700) at pthread_create.c:312
#7  0x00007f5f87e2c00d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Any idea what is happening? I think it is working with recovery in place but
never does a graceful shutdown? Or am I missing something here?

-Ramesh

> -----Original Message-----
> From: Sage Weil [mailto:sweil@xxxxxxxxxx]
> Sent: Friday, July 01, 2016 7:32 PM
> To: Ramesh Chander
> Cc: Brad Hubbard; Somnath Roy; ceph-devel@xxxxxxxxxxxxxxx
> Subject: RE: SIGTERM and osd close
>
> On Fri, 1 Jul 2016, Ramesh Chander wrote:
> > Thank you all for reply,
> >
> > Brad,
> >
> > I should trace the code path you pointed out.
>
> In this case, the important bit is BlueFS::umount(), which calls
> BlueFS::_stop_alloc(). BlueStore::_close_db() should be calling
> bluefs->umount(). Any of the unit tests should be triggering these code
> paths.
>
> sage
>
>
> >
> > -Regards,
> > Ramesh
> >
> > > -----Original Message-----
> > > From: Brad Hubbard [mailto:bhubbard@xxxxxxxxxx]
> > > Sent: Friday, July 01, 2016 3:48 AM
> > > To: Somnath Roy
> > > Cc: Ramesh Chander; ceph-devel@xxxxxxxxxxxxxxx
> > > Subject: Re: SIGTERM and osd close
> > >
> > > On Fri, Jul 1, 2016 at 4:44 AM, Somnath Roy <Somnath.Roy@xxxxxxxxxxx>
> > > wrote:
> > > > You need to call it from BlueStore::umount() I guess for cleanup work..
> > > >
> > > > -----Original Message-----
> > > > From: ceph-devel-owner@xxxxxxxxxxxxxxx
> > > > [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Ramesh Chander
> > > > Sent: Thursday, June 30, 2016 8:46 AM
> > > > To: ceph-devel@xxxxxxxxxxxxxxx
> > > > Subject: SIGTERM and osd close
> > > >
> > > > Hi All,
> > > >
> > > > When I use stop.sh without any argument, I suppose it calls pkill with
> > > > SIGTERM on osds as well as other processes.
> > >
> > >  616   // install signal handlers
> > >  617   init_async_signal_handler();
> > >  618   register_async_signal_handler(SIGHUP, sighup_handler);
> > >  619   register_async_signal_handler_oneshot(SIGINT, handle_osd_signal);
> > >  620   register_async_signal_handler_oneshot(SIGTERM, handle_osd_signal);
> > >
> > >   65 void handle_osd_signal(int signum)
> > >   66 {
> > >   67   if (osd)
> > >   68     osd->handle_signal(signum);
> > >   69 }
> > >
> > > 1735 void OSD::handle_signal(int signum)
> > > 1736 {
> > > 1737   assert(signum == SIGINT || signum == SIGTERM);
> > > 1738   derr << "*** Got signal " << sig_str(signum) << " ***" << dendl;
> > > 1739   shutdown();
> > > 1740 }
> > >
> > > 2598 int OSD::shutdown()
> > > 2599 {
> > >
> > > OSD::shutdown() in src/osd/OSD.cc is quite a large function that performs
> > > quite a bit of clean up such as draining and shutting down thread pool work
> > > queues, shutting down messenger instances, un-registering admin
> > > commands, shutting down the PGs, flushing outstanding ops, updating the
> > > superblock and unmounting the
> > > filestore (as Somnath mentioned this might
> > > be where you want to look), shutting down the MON client and clearing the
> > > peering work queue, in no particular order.
> > >
> > > So there is no doubt the OSD (and other daemons such as MON and MDS)
> > > intercepts this signal and performs a graceful shutdown including many
> > > housekeeping tasks.
> > >
> > > HTH,
> > > Brad
> > >
> > > >
> > > > Does osd handle this signal and take care of closing all components?
> > > >
> > > > I am specifically interested in if it closes objectstore -> keyvaluedb .
> > > >
> > > > I don't see my code of keyvaluedb shutdown/close being called when I
> > > > do ./stop.sh
> > > >
> > > > Any argument or way to force this?
> > > >
> > > > -Ramesh
> > > > PLEASE NOTE: The information contained in this electronic mail message is
> > > > intended only for the use of the designated recipient(s) named above. If the
> > > > reader of this message is not the intended recipient, you are hereby notified
> > > > that you have received this message in error and that any review,
> > > > dissemination, distribution, or copying of this message is strictly prohibited. If
> > > > you have received this communication in error, please notify the sender by
> > > > telephone or e-mail (as shown above) immediately and destroy any and all
> > > > copies of this message in your possession (whether hard copies or
> > > > electronically stored copies).
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel"
> > > > in the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > >
> > > --
> > > Cheers,
> > > Brad
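
For illustration, the call chain discussed in this thread (a signal handler calling handle_signal(), which calls shutdown(), which first blocks in a timed condition-variable wait before the rest of the cleanup runs) can be sketched as a small standalone C++ class. This is a minimal sketch under stated assumptions, not Ceph code: ToyDaemon, ack_stop(), umounted() and ack_seen() are hypothetical names, and the thread does not say what prepare_to_stop() is actually waiting for. It only shows the shape of a bounded wait followed by the remaining shutdown steps.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <csignal>
#include <mutex>

// Hypothetical stand-in for a daemon. None of these names are Ceph's API.
class ToyDaemon {
 public:
  explicit ToyDaemon(std::chrono::milliseconds timeout) : timeout_(timeout) {}

  // Some other party may call this to signal that stopping is acknowledged.
  void ack_stop() {
    std::lock_guard<std::mutex> l(lock_);
    acked_ = true;
    cond_.notify_all();
  }

  // Same shape as the handle_signal() quoted in the thread:
  // check the signal number, then run shutdown.
  void handle_signal(int signum) {
    assert(signum == SIGINT || signum == SIGTERM);
    shutdown();
  }

  // Read these only after handle_signal() has returned.
  bool umounted() const { return umounted_; }
  bool ack_seen() const { return ack_seen_; }

 private:
  // Bounded wait: returns when acked_ becomes true or the timeout expires,
  // whichever comes first. wait_for() with a predicate returns the final
  // value of the predicate.
  void prepare_to_stop() {
    std::unique_lock<std::mutex> l(lock_);
    ack_seen_ = cond_.wait_for(l, timeout_, [this] { return acked_; });
  }

  // The wait comes first; the store is only "unmounted" after it returns.
  void shutdown() {
    prepare_to_stop();
    umounted_ = true;  // placeholder for the real cleanup steps
  }

  std::chrono::milliseconds timeout_;
  std::mutex lock_;
  std::condition_variable cond_;
  bool acked_ = false;
  bool ack_seen_ = false;
  bool umounted_ = false;
};
```

Calling handle_signal(SIGTERM) on a ToyDaemon constructed with a 50 ms timeout returns once the timeout expires, with umounted() true and ack_seen() false. If ack_stop() was called beforehand, the wait returns immediately and ack_seen() is true. In a debugger, a breakpoint on the cleanup step would therefore only be reached after the wait in prepare_to_stop() has finished.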