Hi,

----- Original Message -----
> From: "Sage Weil" <sweil@xxxxxxxxxx>
> To: "Milosz Tanski" <milosz@xxxxxxxxx>
> Cc: "Gregory Farnum" <gfarnum@xxxxxxxxxx>, "Ilya Dryomov" <idryomov@xxxxxxxxx>, "Igor Podoski"
> <Igor.Podoski@xxxxxxxxxxxxxx>, "ceph-devel" <ceph-devel@xxxxxxxxxxxxxxx>
> Sent: Friday, March 25, 2016 11:12:34 AM
> Subject: Re: Ceph watchdog-like thing to reduce IO block during process goes down by abort()
>
> > There's no reason the watcher process can't be a child that's kicked
> > off when the OSD starts up. If there's a pipe between the two, when the
> > parent goes away the child will get an EOF on reading from the pipe. On
> > Linux you can also do a cute trick to have the child notified when
> > parent quits using prctl(PR_SET_PDEATHSIG, SIG???).
>
> That does simplify the startup/management piece, but it means one watcher
> per OSD, and since we want the watcher to have an active mon session to
> make the notification quick, it doubles the mon session load.

Not sure if it's helpful, but

a) the count of sessions drops back once logical OSDs are colocated?
b) AFS had the notion of the "basic overseer" that started all its other
   processes--so I think you had the pipe infrastructure set up to do this
   sort of thing more like in Milosz's model, but with just the one
   overseer per host

(But I might be misreading the thread.)

>
> Honestly I don't think the separate daemon is that much of an issue--it's
> a systemd unit file and a pretty simple watchdog process. The key
> management and systemd enable/activate bit is the part that will be
> annoying.
>
> sage
>

Matt

--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel. 734-707-0660
fax. 734-769-8938
cel. 734-216-5309
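
For reference, a minimal sketch of the pipe trick quoted from Milosz in this message, written in Python rather than anything taken from the Ceph tree. The function names (`watch_parent`, `run_daemon_with_watcher`, `demo`) and the "parent-gone" message are made up for illustration, and the "daemon" is just a process that calls abort() right away. A real watcher would notify the mon instead of writing to a second pipe. The prctl(PR_SET_PDEATHSIG, ...) variant is not shown.

```python
import os


def watch_parent(liveness_r: int, report_w: int) -> None:
    """Watcher body: block until the liveness pipe hits EOF, then report."""
    # read() returns b"" only once every write end of the pipe is closed.
    # The kernel closes the parent's copy when the parent exits, no matter
    # how it dies (clean exit, abort(), SIGKILL).
    while os.read(liveness_r, 1):
        pass
    # Hypothetical notification; stands in for "tell the mon the OSD is gone".
    os.write(report_w, b"parent-gone")
    os._exit(0)


def run_daemon_with_watcher(report_w: int) -> None:
    """Stand-in for the OSD: start a watcher child, then die via abort()."""
    liveness_r, liveness_w = os.pipe()
    if os.fork() == 0:
        # The watcher must not keep the write end open, or it never sees EOF.
        os.close(liveness_w)
        watch_parent(liveness_r, report_w)
    os.close(liveness_r)
    os.close(report_w)
    os.abort()  # simulated crash; liveness_w is closed by the kernel


def demo() -> bytes:
    """Run the fake daemon in a subprocess and return the watcher's report."""
    report_r, report_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(report_r)
        run_daemon_with_watcher(report_w)
    os.close(report_w)
    os.waitpid(pid, 0)
    # Reads until the watcher has written its report and exited.
    with os.fdopen(report_r, "rb") as report:
        return report.read()
```

Calling `demo()` should return the watcher's report once the fake daemon has aborted. The sketch covers one watcher per daemon, which is the arrangement Sage points out would double the mon session load. A single per-host overseer would instead hold one such pipe per child.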