On Mon, Jul 13, 2020 at 09:59:43PM +0200, Martin Wilck wrote: > On Mon, 2020-07-13 at 11:56 +0200, Zdenek Kabelac wrote: > > > > > @Zdenek, do we have to protect every libdm call, or is it > > > sufficient > > > to protect only dm_task_run(), as lixiaokeng suggested? > > > > > > > Hi > > > > It's actually hard to answer it in a simple way. > > Several properties are held in library static variables. > > > > ... > > > > As for the issue of keeping control_fd open - there should be a > > synchronized > > call of dm_hold_control_dev(0/1) - see the codebase of dmeventd.c > > in lvm2 > > code base - how we solve the control_fd handling. > > I've made an assessment of the libdm calls that multipath-tools use. > > Most of them are trivial getters and setters and need no locking, > because they don't operate on any static or global data structures. > > The exceptions I found are are listed here: > > dm_driver_version > dm_hold_control_dev > dm_is_dm_major > dm_lib_exit > dm_lib_release > dm_log_init > dm_log_init_verbose > dm_task_create > dm_task_run > dm_task_set_cookie > dm_udev_set_sync_support > dm_udev_wait > > The reported race around _control_fd could probably be fixed simply > by calling dm_hold_control_dev() and dm_driver_version() early on e.g. > via pthread_once(), because dm_driver_version() calls dm_task_create() > and dm_task_run(), and thus implicitly initializes the _control_fd. > (libmultipath currently doesn't do these calls in the right order). > This should also avoid the need for locking dm_is_dm_major() (access to > _bitset) and dm_task_create (check_version()). The early init function > could also call dm_log_init(), dm_log_init_verbose(), and > dm_udev_set_sync_support(), setting up the libdm parameters once and > for all. > > However, there are other possible races, as you noted. Mainly related > to manipulating the node_ops stack - this concerns dm_task_run(), > dm_udev_wait(), dm_lib_release(), dm_lib_exit(). I'm uncertain about > dm_task_set_cookie(). I suppose it doesn't need protecting, but I tend > to be on the safe side. > > Anyway, that's my summary: the 4 or 5 functions mentioned in the > previous paragraph would need wrapping under a lock. The rest can be > handled by making sure the initialization is made early on, and only > once. > > Comments welcome. We've agreed that dm_lib_release() is unnecessary, since we always call dm_udev_wait() when update_devs() is needed. dm_lib_exit() should already be safe, since we only call it after cleaning up the other threads. dm_task_set_cookie() looks safe to me as well, assuming that we have early initialization. So we are really talking about dm_task_run() and dm_udev_wait(). Clearly dm_udev_wait() always needs to run under a lock. Two threads trying to remove items from a list at the same time isn't safe, and there's no way to know if there will be items on the _node_ops list. The one thing I'm not sure of, is whether every call to dm_task_run() always needs to be locked. clearly we could, and it would be safer. Also, clearly for calls that add elements to _node_ops it must. But for calls like DM_DEVICE_TABLE or DM_DEVICE_LIST, the only issue that I can see with concurrent calls is the possibility that _ioctl_buffer_double_factor gets increased too much. Perhaps that's enough of a problem that we should just lock all dm_task_run() calls as well. -Ben > Regards > Martin > -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel