Hi Can, On Fri, 2020-07-31 at 16:58 +0800, Can Guo wrote: > Hi Stanley, > > On 2020-07-31 16:22, Stanley Chu wrote: > > Hi Can, > > > > On Mon, 2020-07-27 at 15:30 +0800, Can Guo wrote: > >> Hi Stanley, > >> > >> On 2020-07-24 22:01, Stanley Chu wrote: > >> > Currently I/O request could be still submitted to UFS device while > >> > UFS is working on shutdown flow. This may lead to racing as below > >> > scenarios and finally system may crash due to unclocked register > >> > accesses. > >> > > >> > To fix this kind of issues, specifically quiesce all SCSI devices > >> > before UFS shutdown to block all I/O request sending from block > >> > layer. > >> > > >> > Example of racing scenario: While UFS device is runtime-suspended > >> > > >> > Thread #1: Executing UFS shutdown flow, e.g., > >> > ufshcd_suspend(UFS_SHUTDOWN_PM) > >> > Thread #2: Executing runtime resume flow triggered by I/O request, > >> > e.g., ufshcd_resume(UFS_RUNTIME_PM) > >> > > >> > >> I don't quite get it, how can you prevent block layer PM from iniating > >> hba runtime resume by quiescing the scsi devices? Block layer PM > >> iniates hba async runtime resume in blk_queue_enter(). But quiescing > >> the scsi devices can only prevent general I/O requests from passing > >> through scsi_queue_rq() callback. > >> > >> Say hba is runtime suspended, if an I/O request to sda is sent from > >> block layer (sda must be runtime suspended as well at this time), > >> blk_queue_enter() initiates async runtime resume for sda. But since > >> sda's parents are also runtime suspended, the RPM framework shall do > >> runtime resume to the devices in the sequence hba->host->target->sda. > >> In this case, ufshcd_resume() still runs concurrently, no? > >> > > > > You are right. This patch can not fix the case you mentioned. It just > > prevents "general I/O requests". > > > > So perhaps we also need below patch? > > > > #2 scsi: ufs: Use pm_runtime_get_sync in shutdown flow > > https://patchwork.kernel.org/patch/10964097/ > > That is what I am talking about, we definitely need this. Since > you are already working on the fixes to the shutdown path, I will > not upload my fixes (basically look same with yours). However, as > regard for the new change, if pm_runtime_get_sync(hba->dev) < 0, > hba can still be runtime ACTIVE, why directly goto out without a > check of hba's runtime status? > Thanks for reminding this. Then I will fix it and resend both patches as a new series to fix the shutdown path. Thanks so much, Stanley Chu