[RFC] libsas: the trouble with ata resets

Currently libsas has a problem with prematurely dropping sata devices
during recovery.  Libata knows that some devices can take quite a
while to recover from a reset and re-establish the link.  The fact
that sas_ata_hard_reset() ignores its 'deadline' parameter is
evidence that it ignores the link management aspects of what libata
wants from a ->hardreset() handler.

item1: teach sas_ata_hard_reset() to check that the link came back up.
For direct attached devices the lldd will need to be handed the
deadline parameter, and for expander attached devices we perform smp
polling to wait for the link to come back.
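
To make item1 concrete, here is a rough userspace sketch of "poll until the
link is up or the deadline passes".  Nothing in it is libsas or libata code:
struct sketch_phy, sketch_link_is_up() and sketch_wait_for_link() are
hypothetical names, time is counted in abstract ticks rather than jiffies,
and the link query is simulated (in a real implementation it would be an smp
query for an expander attached phy, or an lldd call for a direct attached
one).

```c
#include <stdbool.h>

/* Hypothetical stand-in for a phy; not a libsas structure. */
struct sketch_phy {
	int polls_until_up;	/* simulated: link reports up after this many polls */
};

/* Simulated link-status query; returns true once the link is back. */
static bool sketch_link_is_up(struct sketch_phy *phy)
{
	if (phy->polls_until_up > 0) {
		phy->polls_until_up--;
		return false;
	}
	return true;
}

/*
 * Poll the link until it comes up or 'deadline' passes.
 * 'now' and 'deadline' are abstract ticks; each poll costs 'interval'.
 * Returns 0 if the link came back in time, -1 on timeout.
 */
static int sketch_wait_for_link(struct sketch_phy *phy, unsigned long now,
				unsigned long deadline, unsigned long interval)
{
	if (interval == 0)
		interval = 1;	/* avoid spinning forever on a zero interval */

	while (now <= deadline) {
		if (sketch_link_is_up(phy))
			return 0;
		now += interval;
	}
	return -1;
}
```

The point of the sketch is only that the caller's deadline bounds the wait,
instead of the reset handler returning as soon as the reset has been issued.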

Now, while libata is trying to recover the connection in the host-eh
context, libsas will start receiving BCNs in the host-workqueue
context.  In unfortunate cases libsas may take removal action on a
device that would have come back given a bit more time.  While
libata-eh is in progress libsas should not take any action on the ata
phys in question.

item2: flush eh before trying to determine what action to take on a phy.
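
A single-threaded sketch of the ordering item2 asks for follows.  All names
are hypothetical, and "flushing" eh is simulated by simply completing a
pending recovery; in the kernel this would mean waiting for the eh thread
rather than running anything inline.

```c
#include <stdbool.h>

enum sketch_action { SKETCH_KEEP, SKETCH_REMOVE };

/* Hypothetical per-phy state; not a libsas structure. */
struct sketch_ata_phy {
	bool eh_pending;	/* ata-eh is still recovering this device */
	bool link_up;		/* last observed link state */
	bool link_up_after_eh;	/* simulated outcome of the recovery */
};

/* Stand-in for waiting on eh: completes the simulated recovery, if any. */
static void sketch_flush_eh(struct sketch_ata_phy *phy)
{
	if (phy->eh_pending) {
		phy->link_up = phy->link_up_after_eh;
		phy->eh_pending = false;
	}
}

/* BCN handling: look at the phy only after eh has finished with it. */
static enum sketch_action sketch_handle_bcn(struct sketch_ata_phy *phy)
{
	sketch_flush_eh(phy);
	return phy->link_up ? SKETCH_KEEP : SKETCH_REMOVE;
}
```

Without the flush, the first case in which recovery is still pending would
see link_up == false and remove a device that eh was about to bring back.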

In the case of libsas not all resets are initiated by the eh process
(the sas transport class can reset a phy directly).  It seems libata
takes care to arrange for user requested resets to occur under the
control of eh, and libsas should do the same.

item3: teach all reset entry points to kick and flush eh for ata devices
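
As an illustration of item3, the sketch below routes a user requested reset
through "schedule, then kick and flush eh" so that the reset itself only ever
runs in eh context.  The structure and function names are hypothetical, and
eh is simulated by a plain function call instead of a separate thread.

```c
#include <stdbool.h>

/* Hypothetical port state; not an ata_port. */
struct sketch_port {
	bool reset_scheduled;	/* a reset has been requested */
	bool in_eh;		/* true only while simulated eh is running */
	int resets_performed;
	int resets_outside_eh;	/* should stay 0 */
};

/* Performing the reset: records whether it happened inside eh. */
static void sketch_do_reset(struct sketch_port *p)
{
	p->resets_performed++;
	if (!p->in_eh)
		p->resets_outside_eh++;
}

/* Scheduling the reset: only marks the request. */
static void sketch_schedule_reset(struct sketch_port *p)
{
	p->reset_scheduled = true;
}

/* Simulated eh pass: performs at most one pending reset. */
static void sketch_run_eh(struct sketch_port *p)
{
	p->in_eh = true;
	if (p->reset_scheduled) {
		p->reset_scheduled = false;
		sketch_do_reset(p);
	}
	p->in_eh = false;
}

/* Entry point for a user requested reset (e.g. via the transport class). */
static void sketch_user_reset(struct sketch_port *p)
{
	sketch_schedule_reset(p);
	sketch_run_eh(p);	/* kick and flush eh */
}
```

Multiple requests that arrive before eh runs collapse into a single reset,
which is one practical consequence of separating scheduling from performing.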

A corollary of items 1 and 3 is that there is a difference between
scheduling the reset and performing the reset.
->lldd_I_T_nexus_reset() is currently called twice, once by sas-eh to
manage sas_tasks and again by ata-eh to recover the device.  Likely we
need a new ->lldd_ata_hard_reset() handler that is called by ata-eh,
while ->lldd_I_T_nexus_reset() cleans up the sas_tasks and just
schedules reset on the ata_port.

item4: allow for lldd's to provide a direct ->lldd_ata_hard_reset()
which can be assumed to only be called from ata-eh context.
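
One possible shape for item4, as a sketch only: an optional callback next to
the existing one, with ata-eh preferring it when the lldd provides it.  The
struct below is a hypothetical miniature, not the real lldd ops structure;
the signature chosen for the new callback and the fallback to the nexus reset
when it is absent are both assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical device; the counters exist only to make the sketch testable. */
struct sketch_dev {
	int nexus_calls;
	int hard_calls;
	unsigned long last_deadline;
};

/* Hypothetical miniature of an lldd ops table. */
struct sketch_lldd_ops {
	int (*I_T_nexus_reset)(struct sketch_dev *dev);
	/* optional; only ever called from ata-eh context */
	int (*ata_hard_reset)(struct sketch_dev *dev, unsigned long deadline);
};

/* Example lldd handlers. */
static int sketch_nexus_reset(struct sketch_dev *dev)
{
	dev->nexus_calls++;
	return 0;
}

static int sketch_hard_reset(struct sketch_dev *dev, unsigned long deadline)
{
	dev->hard_calls++;
	dev->last_deadline = deadline;	/* the lldd finally sees the deadline */
	return 0;
}

/* What ata-eh would call: prefer the dedicated handler when present. */
static int sketch_ata_eh_hard_reset(const struct sketch_lldd_ops *ops,
				    struct sketch_dev *dev,
				    unsigned long deadline)
{
	if (ops->ata_hard_reset)
		return ops->ata_hard_reset(dev, deadline);
	if (ops->I_T_nexus_reset)
		return ops->I_T_nexus_reset(dev);	/* assumed fallback */
	return -1;
}
```

With a split like this, the nexus reset handler is free to limit itself to
cleaning up sas_tasks and scheduling the reset on the ata_port.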

Any other pain points in reset handling?

--
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

