On Sun, Apr 22, 2012 at 10:30 AM, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote: > On Fri, 2012-04-13 at 16:37 -0700, Dan Williams wrote: >> When managing shost->host_eh_scheduled libata assumes that there is a >> 1:1 shost-to-ata_port relationship. libsas creates a 1:N relationship >> so it needs to manage host_eh_scheduled cumulatively at the host level. >> The sched_eh and end_eh port port ops allow libsas to track when domain >> devices enter/leave the "eh-pending" state under ha->lock (previously >> named ha->state_lock, but it is no longer just a lock for ha->state >> changes). >> >> Since host_eh_scheduled indicates eh without backing commands pinning >> the device it can be deallocated at any time. Move the taking of the >> domain_device reference under the port_lock to guarantee that the >> ata_port stays around for the duration of eh. > >> Cc: Tejun Heo <tj@xxxxxxxxxx> >> Acked-by: Jacek Danecki <jacek.danecki@xxxxxxxxx> > > Could we standardise on Acked-by, please. In my book it means the > maintainer of a piece of code agrees with the change and lets me take it > through my tree. I'm aware that not everyone uses this definition, so > we can use a different standard from my current one, but what does it > mean in this case? > >> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx> >> --- >> drivers/ata/libata-core.c | 4 ++ >> drivers/ata/libata-eh.c | 57 ++++++++++++++++++++++++++++------- >> drivers/scsi/libsas/sas_ata.c | 38 +++++++++++++++++++++-- >> drivers/scsi/libsas/sas_discover.c | 6 ++-- >> drivers/scsi/libsas/sas_event.c | 12 ++++--- >> drivers/scsi/libsas/sas_init.c | 14 ++++----- >> drivers/scsi/libsas/sas_scsi_host.c | 27 +++++++++++++---- >> include/linux/libata.h | 4 ++ >> include/scsi/libsas.h | 4 ++ >> include/scsi/sas_ata.h | 5 +++ >> 10 files changed, 134 insertions(+), 37 deletions(-) > > This is a pretty big change for rc fixes. None of the other changes in > the series seem to be dependent on it, what bug is it actually fixing? It fixes two bugs (which in hindsight could potentially be split into two patches). The major one is guarantees about when host_eh_scheduled is cleared. Prior to this patch when at least one ata_port in a domain has completed eh the flag for host is cleared. This patch plus "scsi: fix eh wakeup (scsi_schedule_eh vs scsi_restart_operations)" fixes up deadlocks (waiting indefinitely for eh to complete) and cases where eh terminates too early (host_eh_scheduled cleared with work pending) in our hot plug tests. The other bug this uncovered is the fact that libsas makes use of the port and rphy object after libata has completed it's work, so this patch also moved the taking of the domain_device reference to be under spin_lock_irq(&sas_ha->phy_port_lock) and spin_lock(&port->dev_list_lock). Otherwise, if no commands are pending for the device libsas can proceed directly to freeing the domain_device before the requested eh runs. -- Dan -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html