Re: [PATCH 6/6] ufs: Remove the SCSI timeout handler

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2020-05-28 02:47, Can Guo wrote:
> Hi Bart,
> 
> On 2019-12-25 06:02, Bart Van Assche wrote:
>> The UFS SCSI timeout handler was needed to compensate that
>> ufshcd_queuecommand() could return SCSI_MLQUEUE_HOST_BUSY for a long
>> time. Commit a276c19e3e98 ("scsi: ufs: Avoid busy-waiting by eliminating
>> tag conflicts") fixed this so the timeout handler is no longer necessary.
>>
>> See also commit f550c65b543b ("scsi: ufs: implement scsi host timeout
>> handler").
>>
> 
> Sorry for bugging you on this old change. I am afraid we may need to add
> this timeout handler back. Because there is till chances that a request
> gets stuck somewhere in ufshcd_queuecommand() path before
> ufshcd_send_command() gets called. e.g.
> 
> ufshcd_queuecommand()
> ->ufshcd_map_sg()
> -->scsi_dma_map()
> --->dma_map_sg()
> ---->dev->ops->map_sg()
> 
> map_sg() ops may get stuck. map_sg() method can vary on different platforms
> based on actual IOMMU engines. We cannot gaurantee map_sg() ops must return
> immediately as we don't know what is actually inside map_sg() ops.
> 
> And if it gets stuck there for a long time till the request times out,
> without
> the UFS timeout handler, scsi layer will try to abort this request from UFS
> driver by calling ufshcd_abort() eventually. ufshcd_abort() will think this
> request has been completed due to its tag is not in hba->outstanding_reqs
> or UFS host's door bell reg. However, actually, this request is still in
> ufshcd_queuecommand() path. I don't need to continue on the subsequent
> impact
> to UFS driver if ufshcd_abort() happens in this case. This is a corner
> case,
> but it is still possible (I did see map_sg() ops hangs on real devices).
> 
> Having the UFS timeout handler back will prevent this situation as UFS
> timeout
> handler checks if the tag is in hba->outstanding_reqs (for our case, it
> is not
> in there), if no, it returns BLK_EH_RESET_TIMER so that scsi/block layer
> will
> keep waiting.
> 
> What do you think? Please let me know your ideas on this, thanks!

Hi Can,

I see the following issues with the above proposal:
- Although I haven't been able to find explicit documentation of this, I
  think that dma_map_sg() must not sleep. If it would sleep that would
  break most block and SCSI drivers because many of these drivers call
  dma_map_sg() from their .queue_rq() or .queuecommand() implementation
  and if BLK_MQ_F_BLOCKING has not been set these functions must not
  sleep.
- A timeout handler must not be invoked while .queuecommand() is still
  in progress. The SCSI core calls blk_mq_start_request() before it
  calls ufshcd_queuecommand(). The blk_mq_start_request() activates the
  block layer timeout mechanism. ufshcd_queuecommand() must have
  finished before the block layer timeout handler is activated.

Please fix the root cause, namely the map_sg implementation that may get
stuck.

Thanks,

Bart.



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux