Re: [PATCH v3 16/18] scsi: ufs: Synchronize SCSI and UFS error handling

Adrian Hunter <adrian.hunter@xxxxxxxxx> · Tue, 31 Aug 2021 10:24:18 +0300

On 30/08/21 1:18 am, Bart Van Assche wrote:
> On 8/28/21 02:47, Adrian Hunter wrote:
>> There is a deadlock that seems to be related to this patch, because now
>> requests are blocked while the error handler waits on the host_sem.
> 
> Hi Adrian,
> 
> Some but not all of the issues mentioned below have been introduced by patch 16/18. Anyway, thank you for having shared your concerns.
> 
>> Example:
>>
>> ufshcd_err_handler() races with ufshcd_wl_suspend() for host_sem.
>> ufshcd_wl_suspend() wins the race but now PM requests deadlock:
>>
>> because:
>>   scsi_queue_rq() -> scsi_host_queue_ready() -> scsi_host_in_recovery() is FALSE
>>
>> because:
>>   scsi_schedule_eh() has done:
>>         scsi_host_set_state(shost, SHOST_RECOVERY) == 0 ||
>>         scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY) == 0)
>>
>>
>> Some questions for thought:
>>
>> Won't any holder of host_sem deadlock if it tries to do SCSI requests
>> and the error handler is waiting on host_sem?
>>
>> Won't runtime resume deadlock if it is initiated by the error handler?
> 
> My understanding is that host_sem is used for the following purposes:
> - To prevent that sysfs attributes are read or written after shutdown
>   has started (hba->shutting_down).
> - To serialize sysfs attribute access, clock scaling, error handling,
>   the ufshcd_probe_hba() call from ufshcd_async_scan() and hibernation.
> 
> I propose to make the following changes:
> - Instead of checking the value of hba->shutting_down from inside sysfs
>   attribute callbacks, remove sysfs attributes before starting shutdown.
>   That will remove the need to check hba->shutting_down from inside
>   sysfs attribute callbacks.
> - Leave out the host_sem down() and up() calls from ufshcd_wl_suspend()
>   and ufshcd_wl_resume(). Serializing hibernation against e.g. sysfs
>   attribute access is not the responsibility of a SCSI LLD - this is the
>   responsibility of the power management core.
> - Split host_sem. I don't see how else to address the potential deadlock
>   between the error handler and runtime resume.
> 
> Do you agree with the above?

Looking some more:

sysfs and debugfs use direct access, so there is probably not a problem
there.

bsg also uses direct access but doesn't appear to have synchronization
so there is maybe a gap there.  That is an existing problem.

As an aside, the current synchronization for direct access doesn't make
complete sense because the lock (host_sem) remains held across retries
(e.g. ufshcd_query_descriptor_retry) preventing error handling between
retries.  That is an existing problem.

ufshcd_wl_suspend() and ufshcd_wl_shutdown() could wait for error handling
and then disable it somehow. ufshcd_wl_resume() would have to enable it.

That leaves runtime PM.  Since the error handler can block runtime resume,
it cannot wait for runtime resume, it must exit.  Another complication is
that the PM workqueue (pm_wq) gets frozen early during system suspend, so
requesting an asynchronous runtime resume won't necessarily make any
progress.

How does splitting the host_sem address the potential deadlock
between the error handler and runtime resume?