On 12/20/22 00:55, John Garry wrote:
> On 19/12/2022 15:28, Jason Yan wrote:
>>>> +    if (test_bit(SAS_DEV_GONE, &dev->state) && dev_is_sata(dev))
>>>> +        sas_ata_device_link_abort(dev, false);
>>>
>>> Firstly, I think that there is a bug in sas_ata_device_link_abort() ->
>>> ata_link_abort() code in that the host lock in not grabbed, as the
>>> comment in ata_port_abort() mentions. Having said that, libsas had
>>> already some dodgy host locking usage - specifically dropping the lock
>>> for the queuing path (that's something else to be fixed up ... I think
>>
>> Taking big locks in queuing path is not a good idea. This will bring
>> down performance.
>
> But it is expected that ata_qc_issue() should be called with that the
> host lock grabbed (and keep it).
>
> I think that the reason libsas drops the lock is because some LLDD
> queuecommand CBs calls task_done() in some error paths. If we kept the
> lock held, then we could have a deadlock, for example:
>
> sas_ata_qc_issue (has lock) -> lldd_execute_task() =
> pm8001_queue_command() -> task_done() = sas_ata_task_done() -> grab host
> lock => deadlock.

That should be easily solvable using a workqueue for doing task_done(), no ?

>
> Thanks,
> John

-- 
Damien Le Moal
Western Digital Research
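
To illustrate the deferral idea, here is a rough single-threaded userspace model, not kernel code and not the actual libsas or pm8001 implementation. Every name in it (struct host, host_lock(), issue(), flush_deferred(), the pending list) is made up for the sketch. The "host lock" is a plain flag that counts re-acquisition instead of spinning, and the pending list stands in for a workqueue. In the kernel, the deferred step would presumably be a work item set up with INIT_WORK() and queued with queue_work() or schedule_work().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task;

/* Hypothetical model of a host with a non-recursive lock. */
struct host {
    bool locked;           /* true while the modeled host lock is held */
    int completed;         /* tasks completed through task_done() */
    int would_deadlock;    /* times the lock was taken while already held */
    struct task *pending;  /* deferred completions (workqueue stand-in) */
};

struct task {
    struct host *host;
    struct task *next;
};

static void host_lock(struct host *h)
{
    if (h->locked)
        h->would_deadlock++;  /* a real spinlock would spin forever here */
    h->locked = true;
}

static void host_unlock(struct host *h)
{
    h->locked = false;
}

/* Completion callback that needs the host lock. */
static void task_done(struct task *t)
{
    host_lock(t->host);
    t->host->completed++;
    host_unlock(t->host);
}

/* Error path of a hypothetical LLDD: complete inline, or defer. */
static void lldd_execute_task(struct task *t, bool defer)
{
    if (defer) {
        /* Queue the completion; it runs later, outside the lock. */
        t->next = t->host->pending;
        t->host->pending = t;
    } else {
        task_done(t);  /* re-takes the lock held by issue() */
    }
}

/* Issue path that keeps the host lock held across the LLDD call. */
static void issue(struct task *t, bool defer)
{
    host_lock(t->host);
    lldd_execute_task(t, defer);
    host_unlock(t->host);
}

/* Stand-in for the worker: runs deferred completions without the lock. */
static void flush_deferred(struct host *h)
{
    while (h->pending) {
        struct task *t = h->pending;

        h->pending = t->next;
        t->next = NULL;
        task_done(t);
    }
}
```

With defer set to false, issue() hits the modeled deadlock once. With defer set to true, the completion only runs from flush_deferred(), after the lock has been dropped, so the issue path can keep the lock held for the whole call.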