On 07/10/2017 09:06 AM, Yijing Wang wrote:
> Disco mutex was introduced to prevent domain rediscovery competing
> with ata error handling (87c8331). If we already hold the lock
> in sas_revalidate_domain and execute the probe synchronously, a deadlock
> results, because sas_probe_sata() also needs to hold disco_mutex. Since
> disco mutex is used to prevent revalidate domain from happening during the
> ata error handler, it should be safe to release disco mutex during the sync
> probe, because no new revalidate domain event would be processed until the
> sync returns and the current sas revalidate domain finishes.
>
> Signed-off-by: Yijing Wang <wangyijing@xxxxxxxxxx>
> CC: John Garry <john.garry@xxxxxxxxxx>
> CC: Johannes Thumshirn <jthumshirn@xxxxxxx>
> CC: Ewan Milne <emilne@xxxxxxxxxx>
> CC: Christoph Hellwig <hch@xxxxxx>
> CC: Tomas Henzl <thenzl@xxxxxxxxxx>
> CC: Dan Williams <dan.j.williams@xxxxxxxxx>
> ---
>  drivers/scsi/libsas/sas_expander.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
> index 9d26c28..077024e 100644
> --- a/drivers/scsi/libsas/sas_expander.c
> +++ b/drivers/scsi/libsas/sas_expander.c
> @@ -776,6 +776,7 @@ static struct domain_device *sas_ex_discover_end_dev(
>  	struct ex_phy *phy = &parent_ex->ex_phy[phy_id];
>  	struct domain_device *child = NULL;
>  	struct sas_rphy *rphy;
> +	bool prev_lock;
>  	int res;
>
>  	if (phy->attached_sata_host || phy->attached_sata_ps)
> @@ -803,6 +804,7 @@ static struct domain_device *sas_ex_discover_end_dev(
>  	sas_ex_get_linkrate(parent, child, phy);
>  	sas_device_set_phy(child, phy->port);
>
> +	prev_lock = mutex_is_locked(&child->port->ha->disco_mutex);
>  #ifdef CONFIG_SCSI_SAS_ATA
>  	if ((phy->attached_tproto & SAS_PROTOCOL_STP) || phy->attached_sata_dev) {
>  		res = sas_get_ata_info(child, phy);
> @@ -832,7 +834,11 @@ static struct domain_device *sas_ex_discover_end_dev(
>  				    SAS_ADDR(parent->sas_addr), phy_id, res);
>  			goto out_list_del;
>  		}
> +		if (prev_lock)
> +			mutex_unlock(&child->port->ha->disco_mutex);
>  		sas_disc_wait_completion(child->port, DISCE_PROBE);
> +		if (prev_lock)
> +			mutex_lock(&child->port->ha->disco_mutex);
>
>  	} else
>  #endif
> @@ -861,7 +867,11 @@ static struct domain_device *sas_ex_discover_end_dev(
>  				    SAS_ADDR(parent->sas_addr), phy_id, res);
>  			goto out_list_del;
>  		}
> +		if (prev_lock)
> +			mutex_unlock(&child->port->ha->disco_mutex);
>  		sas_disc_wait_completion(child->port, DISCE_PROBE);
> +		if (prev_lock)
> +			mutex_lock(&child->port->ha->disco_mutex);
>  	} else {
>  		SAS_DPRINTK("target proto 0x%x at %016llx:0x%x not handled\n",
>  			    phy->attached_tproto, SAS_ADDR(parent->sas_addr),
>
I would rather have an analysis if this really cannot happen; 'should
not' is rather vague.
But seeing that it _is_ quite complex:

Reviewed-by: Hannes Reinecke <hare@xxxxxxxx>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@xxxxxxx			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
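
For readers following the locking argument, here is a minimal userspace sketch of the pattern the patch describes: a caller that holds a lock, a probe step that needs the same lock, and the lock being dropped around the synchronous probe. It is not libsas code. It uses POSIX pthread mutexes instead of kernel mutexes, runs the probe in the same thread instead of waiting on a work item, and the names probe_needing_disco_mutex() and discover_with_sync_probe() are made up for illustration. pthread_mutex_trylock() stands in for the blocking lock so that the would-be deadlock shows up as EBUSY instead of a hang.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for ha->disco_mutex. */
static pthread_mutex_t disco_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical stand-in for the probe path, which needs disco_mutex.
 * trylock is used so the demo returns EBUSY where a blocking lock
 * would wait forever.
 */
static int probe_needing_disco_mutex(void)
{
	int rc = pthread_mutex_trylock(&disco_mutex);

	if (rc != 0)
		return rc;	/* lock still held by the caller */
	/* ... probe work would run here ... */
	rc = pthread_mutex_unlock(&disco_mutex);
	assert(rc == 0);
	return 0;
}

/*
 * Hypothetical stand-in for the discovery path. With drop_lock set, the
 * lock is released around the synchronous probe and retaken afterwards,
 * mirroring the shape of the patch.
 */
static int discover_with_sync_probe(bool drop_lock)
{
	int rc;

	pthread_mutex_lock(&disco_mutex);	/* held by the revalidate path */

	if (drop_lock)
		pthread_mutex_unlock(&disco_mutex);
	rc = probe_needing_disco_mutex();	/* synchronous probe */
	if (drop_lock)
		pthread_mutex_lock(&disco_mutex);

	pthread_mutex_unlock(&disco_mutex);
	return rc;
}
```

Calling discover_with_sync_probe(false) returns EBUSY, the userspace analogue of the deadlock; discover_with_sync_probe(true) returns 0. The sketch says nothing about what other threads may do while the lock is dropped, which is presumably the window the review asks to have analysed.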