On Mon, Jul 27, 2015 at 8:17 AM, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote: > On Wed, 2015-07-22 at 14:08 -0700, Dan Williams wrote: >> On Wed, Jul 22, 2015 at 11:28 AM, James Bottomley >> <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote: >> > >> >> } >> >> >> >> void sas_device_set_phy(struct domain_device *dev, struct sas_port *port) >> >> diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c >> >> index d3c5297c6c89..9a25ae3a52a4 100644 >> >> --- a/drivers/scsi/libsas/sas_port.c >> >> +++ b/drivers/scsi/libsas/sas_port.c >> >> @@ -219,7 +219,6 @@ void sas_deform_port(struct asd_sas_phy *phy, int gone) >> >> >> >> if (port->num_phys == 1) { >> >> sas_unregister_domain_devices(port, gone); >> >> - sas_port_delete(port->port); >> >> port->port = NULL; >> >> } else { >> >> sas_port_delete_phy(port->port, phy->phy); >> >> >> > >> > This should become >> > >> > if (port->num_phys == 1) >> > sas_unregister_domain_device(port, gone); >> > >> > sas_port_delete_phy(port->port, phy->phy); >> > >> > So we end up with a port scheduled for destruction with no phys rather >> > than making the last phy association hang around until the DISCE >> > workqueue runs. >> >> Sounds ok in theory. > > It's not really a choice. The specific problem you've introduced with > this patch is failure to cope with link flutter: a deform and form event > queued sequentially. In the new scheme you're trying to introduce, the > destruct event gets queued from the deform but behind the form and the > link flutter results in a dead link. I thought just forcing a zero phy > port would fix this, but it won't, either the destruct has to run in the > context of the deform event or the form has to be queued later than the > destruct. I think coupled with the changes above, there needs to be > > if (port->port) { > /* dying port, requeue form event */ > resend the PORTE_BYTES_DMAED event > return > } > > inside the unmatched port loop in sas_port_form() if nothing is found as > well to close this. I think it's too late. Once the lldd has triggered libsas to start tear down I seem to recall the lldd has the expectation that a new PORTE_BYTES_DMAED triggers the creation of a new port instance for that phy. Once the flutter reaches libsas the race is already lost and the port needs to be torn down, but I would need to take a closer look. >> I don't have a libsas environment handy, I worked with Praveen to >> validate the version as submitted if you want to re-work it. > > A couple of days ago, this was so urgent as to have to go outside the > usual patch process ... now it's not important enough for you to bother > working on it; which is it? Neither, it was a reviewed patch that was idling in the process. I'm still of the opinion that pinging Andrew in a case like this *is* the expected process, unless there's a place I can check that a patch is still in the application queue? -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html