RE: [PATCH v1 0/2] scsi: ufs: Fix broken hba->outstanding_tasks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stanley,

> 
> Hi Avri,
> 
> On Mon, 2019-07-22 at 06:10 +0000, Avri Altman wrote:
> > >
> > > >
> > > > Hi,
> > > >
> > > > >
> > > > > Currently bits in hba->outstanding_tasks are cleared only after their
> > > > > corresponding task management commands are successfully done
> by
> > > > > __ufshcd_issue_tm_cmd().
> > > > >
> > > > > If timeout happens in a task management command, its
> corresponding
> > > > > bit in hba->outstanding_tasks will not be cleared until next task
> > > > > management command with the same tag used successfully
> finishes.‧
> > > > ufshcd_clear_tm_cmd is also called as part of ufshcd_err_handler.
> > > > Does this change something in your assumptions?
> > > And BTW there is a specific __clear_bit in __ufshcd_issue_tm_cmd() in
> case
> > > of a TO.
> >
> > Gave it another look -
> > If indeed this bit isn't cleared as part of the error flow that the timeout
> triggers,
> > I think you should relate to ufshcd_clear_tm_cmd specifically in your
> commit log -
> > Because this is the obvious place where the bit cleanup should take
> place.
> >
> > Also the fix should be much more intuitive IMO -
> > Today we do __clear_bit() on success, ufshcd_clear_tm_cmd() on error,
> > And also ufshcd_put_tm_slot() either way?
> >
> > Maybe you can choose a single place to clear it, without any additional
> code?
> 
> ufshcd_clear_tm_cmd() is similar to ufshcd_clear_cmd() which tries to
> write timed-out bit in "clear register". They do not clean bits in both
> outstanding masks.
> 
> Another reason to not to do outstanding bits cleanup in
> ufshcd_clear_tm_cmd() is: if host is unable to clear TM command by
> setting "clear register", the TM command may be still "outstanding" in
> the device. In this case, it may be better to do cleanup after reset is
> done. Cleanup includes bits in hba->outstanding_tasks and
> hba->tm_slots_in_use which is possibly cleaned too early by
> ufshcd_put_tm_slot() if command is timed-out now.
> 
> Referring to error handling flow of hba->outstanding_reqs, all timed-out
> bits will be cleaned by ufshcd_reset_and_restore() =>
> ufshcd_transfer_req_compl() after reset is done. Similar handling for
> hba->outstanding_tasks could be applied, i.e., do cleanup by
> ufshcd_reset_and_restore() => ufshcd_tmc_handler().
Fair enough.  Thank you for the detailed explanation.

> 
> The next thing is what you suggested: How to make the fix more
> intuitive. Patchset v2 will be uploaded for review: It tries to
> "re-factor" cleanup jobs first, and then add fixed flow to make the
> whole patch more readable.
> 
> One more thing, above description and flow is done through UFSHCD SCSI
> error handling routines registered with SCSI Midlayer. For TM command
> timeout happening in bsg path without error handling triggered by SCSI
> layer, we may need to consider to handle those tasks in future patches.
Please do.

Thanks,
Avri





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux