On 12/06/2018 10:35 PM, Jason Dillaman wrote:
On Thu, Dec 6, 2018 at 12:44 AM Dongsheng Yang
<dongsheng.yang@xxxxxxxxxxxx> wrote:
On 12/06/2018 01:33 PM, Dongsheng Yang wrote:
On 12/06/2018 12:16 AM, Jason Dillaman wrote:
On Wed, Dec 5, 2018 at 5:17 AM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
On Wed, Dec 5, 2018 at 3:46 AM Dongsheng Yang
<dongsheng.yang@xxxxxxxxxxxx> wrote:
Hi Ilya and Jason,
Maybe there is another option: umh (usermode helper):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/umh.h
We can provide a subcommand in userspace for journal replaying, and we
can check the journal in rbd map and when reacquiring the exclusive-lock.
If we find an uncommitted entry, we can call the userspace helper via
umh to replay it.
Yes, making an upcall from the kernel might be an option, but the
problem is that this can happen deep in the I/O path. I'm not sure
it's safe wrt memory allocation deadlocks because the helper is run
out of a regular workqueue, etc.
Another option might be to daemonize "rbd map" process.
Or maybe attempting a minimal replay in the kernel and going read-only
in case something is wrong is actually fine as a starting point...
I'd vote for daemonizing "rbd map" (or a similar small, purpose-built
tool).
+1 for this.
Can we introduce a new service to watch all images mapped on a node?
If we start a new process for each rbd map, that's a little expensive. We
can start a single service to watch all of them, registering in rbd map and
unregistering in rbd unmap.
What's your concern re: the expense of one daemon per image? If you
As we probably have lots of images mapped on a node when we
are using k8s with krbd, each mapping would start a new process
to watch its image. But the work it needs to do is very simple
and rare: replaying entries. So I think that's a little wasteful.
have a single daemon per-node, what's your communication channel to
alert the daemon of map/unmap events?
Maybe shared memory. We can define a unique shm_key for this
shared memory (what I did in another project was to pick a number from Pi,
such as "983367", which is the 501st ~ 506th digits of Pi).
Then the daemon can create the shared memory on startup and destroy
it on shutdown.
The rbd map command can find the shared memory with the specified shm_key.
If found, it registers itself; otherwise, the map fails. We can also keep
the lock in shared memory and solve the problem of a process being killed
without unlocking by using PTHREAD_MUTEX_ROBUST.
That's just an option; we can make it a future improvement if we
really find it's worth doing. :)
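To make the register/unregister flow above a bit more concrete, here is a
rough sketch in Python. It is only an illustration under several assumptions:
it uses a POSIX named segment from multiprocessing.shared_memory rather than
a SysV shm_key, and the segment name, slot layout and all function names are
hypothetical. It also leaves out the PTHREAD_MUTEX_ROBUST lock, which the
Python standard library does not expose, so it is not safe against
concurrent "rbd map" callers as written.

```python
import struct
from multiprocessing import shared_memory

# All names and sizes below are hypothetical, chosen only for illustration.
SHM_NAME = "rbd_watch_983367"   # well-known name agreed on by daemon and rbd map
MAX_SLOTS = 64                  # registry capacity
SLOT = struct.Struct("<BI")     # one slot: (in_use flag, rbd device id)
SIZE = MAX_SLOTS * SLOT.size


def daemon_start(name=SHM_NAME):
    """Daemon startup: create the segment and mark every slot free."""
    shm = shared_memory.SharedMemory(name=name, create=True, size=SIZE)
    shm.buf[:SIZE] = bytes(SIZE)
    return shm


def daemon_stop(shm):
    """Daemon shutdown: detach from the segment and destroy it."""
    shm.close()
    shm.unlink()


def _attach(name):
    """Attach to the daemon's segment, or return None if it does not exist."""
    try:
        return shared_memory.SharedMemory(name=name)
    except FileNotFoundError:
        return None


def register(dev_id, name=SHM_NAME):
    """rbd map side: claim a free slot. False if no daemon or registry full."""
    shm = _attach(name)
    if shm is None:
        return False            # no daemon running, so the map should fail
    try:
        for i in range(MAX_SLOTS):
            in_use, _ = SLOT.unpack_from(shm.buf, i * SLOT.size)
            if not in_use:
                SLOT.pack_into(shm.buf, i * SLOT.size, 1, dev_id)
                return True
        return False
    finally:
        shm.close()


def unregister(dev_id, name=SHM_NAME):
    """rbd unmap side: release the slot holding dev_id, if there is one."""
    shm = _attach(name)
    if shm is None:
        return False
    try:
        for i in range(MAX_SLOTS):
            in_use, dev = SLOT.unpack_from(shm.buf, i * SLOT.size)
            if in_use and dev == dev_id:
                SLOT.pack_into(shm.buf, i * SLOT.size, 0, 0)
                return True
        return False
    finally:
        shm.close()


def mapped_devices(shm):
    """Daemon side: list the device ids currently registered."""
    slots = (SLOT.unpack_from(shm.buf, i * SLOT.size) for i in range(MAX_SLOTS))
    return [dev for in_use, dev in slots if in_use]
```

In this sketch the daemon would call daemon_start() once and poll
mapped_devices(), while "rbd map" and "rbd unmap" would call register() and
unregister(). A real implementation would need the robust mutex around the
slot scan so that a killed mapper cannot leave the registry locked.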
Thanx
The local "rbd-mirror" daemon process doesn't currently open a
watch on local primary images and in the case of one-way mirroring,
you would potentially not even have an "rbd-mirror" daemon running
locally.
Thanks,
Ilya