Hi Yasui-san,

On 01/20/2010 05:40 AM +0900, Takahiro Yasui wrote:
> Hi,
>
> This is a patch set to fix a deadlock on suspending a mirror device.
>
>
> ISSUE
> =====
>
> The suspend procedure on a dm-mirror device could cause a deadlock on the
> recovery_count semaphore.
>
> When mirror_presuspend is called, the recovery_count semaphore is acquired in
> dm_rh_stop_recovery() to stop the recovery routine, but when a signal is caught
> in dm_wait_for_completion() or an error occurs in dm_suspend(),
> the suspend process is interrupted without releasing the recovery_count
> semaphore of the mirror device. If another suspend is then executed,
> that suspend process gets stuck at dm_rh_stop_recovery().
>
> When the suspend procedure is interrupted, the device should work properly
> since the status of the device is not "suspended."
>
>
> SOLUTION
> ========
>
> Introduce a target handler, cancel_presuspend, to cancel status changes
> done by a target specific presuspend handler.

How about using ->resume as the cancelling method?
Though you would have to audit the existing targets' ->resume handlers,
I think it's a better idea than adding another target handler
just for this purpose.

Also, your dm-raid1 patch misses cancelling the log's presuspend,
which is used by dm-log-userspace.
So it seems that dm-raid1 can use ->resume to cancel presuspend.

Thanks,
Kiyoshi Ueda

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
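
The deadlock and the ->resume suggestion discussed in this message can be pictured with a small user-space model. This is only a sketch, not kernel code: every toy_* name is hypothetical, the semaphore is reduced to a plain counter, and a call that would sleep in the kernel returns an error instead. It also assumes that the target's ->resume releases exactly what presuspend acquired, which is the property the audit of the existing ->resume handlers would have to confirm.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes for this model only. */
enum { TOY_OK = 0, TOY_EINTR = -1, TOY_WOULD_BLOCK = -2 };

/* Hypothetical stand-in for the region hash: only the recovery_count
 * semaphore is modelled, as a plain counter of free slots. */
struct toy_rh {
    int recovery_count;   /* free slots currently in the semaphore */
    int max_recovery;     /* slots taken when recovery is stopped */
};

/* Models the presuspend side: take every slot. Returns false in the
 * situation where a real semaphore acquire would sleep forever. */
bool toy_presuspend(struct toy_rh *rh)
{
    if (rh->recovery_count < rh->max_recovery)
        return false;
    rh->recovery_count -= rh->max_recovery;
    return true;
}

/* Models the resume side: give every slot back. */
void toy_resume(struct toy_rh *rh)
{
    assert(rh->recovery_count == 0);  /* only valid after a successful stop */
    rh->recovery_count += rh->max_recovery;
}

/* Models one suspend attempt. 'interrupted' stands for a signal or error
 * after presuspend; 'cancel_with_resume' selects the suggested behaviour
 * of calling resume to undo presuspend on that path. */
int toy_suspend(struct toy_rh *rh, bool interrupted, bool cancel_with_resume)
{
    if (!toy_presuspend(rh))
        return TOY_WOULD_BLOCK;       /* the reported deadlock */
    if (interrupted) {
        if (cancel_with_resume)
            toy_resume(rh);           /* undo presuspend before bailing out */
        return TOY_EINTR;
    }
    return TOY_OK;
}
```

In this model, an interrupted toy_suspend() without the cancel step leaves the counter at zero, so the next attempt reports TOY_WOULD_BLOCK. With cancel_with_resume set, the slots are returned and a later suspend succeeds.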