Re: [RFC][PATCH 0/3] dm-raid1: fix deadlock at suspend after suspend was interrupted

Hi Yasui-san,

On 01/20/2010 05:40 AM +0900, Takahiro Yasui wrote:
> Hi,
> 
> This is a patch set to fix deadlock on suspending of mirror device.
> 
> 
> ISSUE
> =====
> 
> Suspend procedure on a dm-mirror device could cause deadlock on recovery_count
> semaphore.
> 
> When mirror_presuspend is called, the recovery_count semaphore is acquired in
> dm_rh_stop_recovery() to stop the recovery routine, but when a signal is caught
> in dm_wait_for_completion() or an error occurs in dm_suspend(),
> the suspend process is interrupted without releasing the recovery_count
> semaphore of the mirror device. This means that when another suspend is
> executed, the suspend process gets stuck at dm_rh_stop_recovery().
> 
> When suspend procedure is interrupted, the device should work properly since
> the status of the device is not "suspended."
> 
> 
> SOLUTION
> ========
> 
> Introduce a target handler, cancel_presuspend, to cancel status changes
> done by a target specific presuspend handler.

How about using ->resume as the cancelling method?
Although you would have to audit the existing targets' ->resume handlers,
I think it's a better idea than adding another target handler
just for this purpose.

Also, your dm-raid1 patch misses cancelling the log's presuspend, which
is used by dm-log-userspace.
So it seems that dm-raid1 can use ->resume to cancel presuspend.
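
To illustrate the idea, here is a rough userspace sketch, not kernel code.
All names in it (mirror_model, model_presuspend, model_resume,
model_suspend) are hypothetical, and the recovery_count semaphore is
reduced to a single counter. It only assumes that presuspend takes the
semaphore and that resume gives it back, and it shows an interrupted
suspend calling the resume step to undo presuspend:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct mirror_model {
    int recovery_slots;     /* stands in for the recovery_count semaphore */
    bool recovery_stopped;  /* true while presuspend holds the slot */
};

static void model_init(struct mirror_model *m)
{
    m->recovery_slots = 1;
    m->recovery_stopped = false;
}

/*
 * Models the presuspend step taking the semaphore.
 * Returns false where the real code would block forever.
 */
static bool model_presuspend(struct mirror_model *m)
{
    if (m->recovery_slots == 0)
        return false;
    m->recovery_slots--;
    m->recovery_stopped = true;
    return true;
}

/*
 * Models the resume step giving the semaphore back.
 * It is idempotent, so it can serve both as a normal resume and as
 * the cancel path of an interrupted suspend.
 */
static void model_resume(struct mirror_model *m)
{
    if (m->recovery_stopped) {
        m->recovery_slots++;
        m->recovery_stopped = false;
        assert(m->recovery_slots == 1);
    }
}

/*
 * Models one suspend attempt.
 *   interrupted:        a signal or error arrives after presuspend
 *   cancel_with_resume: undo presuspend through the resume step
 * Returns 0 on success, -1 if it would deadlock, -2 if interrupted.
 */
static int model_suspend(struct mirror_model *m, bool interrupted,
                         bool cancel_with_resume)
{
    if (!model_presuspend(m))
        return -1;
    if (interrupted) {
        if (cancel_with_resume)
            model_resume(m);
        return -2;
    }
    return 0;
}
```

In this model, an interrupted suspend without the cancel step makes the
next model_suspend() return -1 (the stuck case from your description),
while with cancel_with_resume set the next suspend succeeds. The real
->resume handlers would need the same "safe to call after a partial
suspend" property, which is the audit I mentioned above.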

Thanks,
Kiyoshi Ueda

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
