On Thu, Dec 6, 2018 at 12:44 AM Dongsheng Yang <dongsheng.yang@xxxxxxxxxxxx> wrote:
>
>
> On 12/06/2018 01:33 PM, Dongsheng Yang wrote:
> >
> >
> > On 12/06/2018 12:16 AM, Jason Dillaman wrote:
> >> On Wed, Dec 5, 2018 at 5:17 AM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> >>> On Wed, Dec 5, 2018 at 3:46 AM Dongsheng Yang
> >>> <dongsheng.yang@xxxxxxxxxxxx> wrote:
> >>>> Hi Ilya and Jason,
> >>>> Maybe there is another option, umh (user mode helper):
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/umh.h
> >>>>
> >>>> We can provide a subcommand in userspace for journal replaying. We
> >>>> can check the journal and reacquire the exclusive lock in rbd map;
> >>>> if we find an uncommitted entry, we can call the userspace helper
> >>>> via umh to replay it.
> >>> Yes, making an upcall from the kernel might be an option, but the
> >>> problem is that this can happen deep in the I/O path. I'm not sure
> >>> it's safe wrt memory allocation deadlocks because the helper is run
> >>> out of a regular workqueue, etc.
> >>>
> >>> Another option might be to daemonize the "rbd map" process.
> >>>
> >>> Or maybe attempting a minimal replay in the kernel and going read-only
> >>> in case something is wrong is actually fine as a starting point...
> >> I'd vote for daemonizing "rbd map" (or a similar small, purpose-built
> >> tool).
> >
> > +1 for this.
> Can we introduce a new service to watch all images mapped on a node?
> If we start a new process for each rbd map, that's a little expensive. We
> can start a single service to watch all of them, register in rbd map and
> unregister in rbd unmap.

What's your concern re: the expense of one daemon per image? If you have
a single daemon per node, what's your communication channel to alert the
daemon of map/unmap events?

> >> The local "rbd-mirror" daemon process doesn't currently open a
> >> watch on local primary images, and in the case of one-way mirroring
> >> you would potentially not even have an "rbd-mirror" daemon running
> >> locally.
> >>
> >>> Thanks,
> >>>
> >>> Ilya
> >>
> >>
> >
> >
> >

--
Jason
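
For reference, a minimal sketch of the umh upcall Dongsheng describes,
assuming a hypothetical "rbd journal-replay" userspace subcommand and
helper path; this is illustrative only, under those assumptions, and is
not existing krbd code:

    #include <linux/kernel.h>
    #include <linux/umh.h>

    /*
     * Illustrative only: invoke a hypothetical userspace helper,
     * e.g. "rbd journal-replay <pool>/<image>", when uncommitted
     * journal entries are found at map time.  The helper path and
     * subcommand name are assumptions, not existing rbd interfaces.
     */
    static int rbd_call_journal_replay(const char *pool, const char *image)
    {
            char spec[128];
            char *argv[] = { "/usr/bin/rbd", "journal-replay", spec, NULL };
            static char *envp[] = {
                    "HOME=/",
                    "PATH=/sbin:/usr/sbin:/bin:/usr/bin",
                    NULL
            };

            snprintf(spec, sizeof(spec), "%s/%s", pool, image);

            /*
             * UMH_WAIT_PROC blocks until the helper exits, which is
             * exactly the concern raised above: this would run out of
             * a workqueue deep in the I/O path, so the memory
             * allocation and deadlock implications need careful review.
             */
            return call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
    }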