On Mon, Jun 27, 2011 at 8:59 AM, Douglas Gilbert <dgilbert@xxxxxxxxxxxx> wrote:
> On 11-06-27 10:34 AM, Hannes Reinecke wrote:
>>
>> On 06/24/2011 09:48 PM, Dan Williams wrote:
>>>
>>> From: Jeff Skirvin <jeffrey.d.skirvin@xxxxxxxxx>
>>>
>>> When resetting a sata device in the domain we have seen occasions where
>>> libsas prematurely marks a device gone in the time it takes for the
>>> device to re-establish the link. This plays badly with software raid
>>> arrays. Other libsas drivers have non-uniform delays in their reset
>>> handlers to try to cover this condition, but not sufficient to close the
>>> hole. Given that a sata device can take many seconds to recover we
>>> filter bcns and poll for the device reattach state before notifying
>>> libsas that the port needs the domain to be rediscovered. Once this has
>>> been proven out at the lldd level we can think about uplevelling this
>>> feature to a common implementation in libsas.
>>>
>> That's the second time something like this has come up now.
>> Wouldn't it make sense to implement something like the dev_loss_tmo
>> mechanism we have for FC? That should cover this situation nicely ...
>
> "NOTE 112 - An STP initiator port should retry connection
> requests for at least the time indicated by the STP
> SMP I_T NEXUS LOSS TIME field in the SMP REPORT GENERAL
> response for the STP target port to which it is trying to
> establish a connection." [spl2r01.pdf 9.4.3.18 page 612]
>
> The recommended value for that field is 2000 (i.e. 2 seconds).
> So 2 seconds is not enough in some circumstances?

I believe we have seen longer, but the other concern is that the
scsi_eh kthread has no coupling with the libsas discovery thread (host
workqueue). So, for example, the 2 second wait that mvsas performs in
mvs_debug_I_T_nexus_reset() (which I assume is for this purpose) does
not preclude bcns from being processed during that time.
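
To make the "filter bcns and poll for reattach" idea concrete, here is a
minimal userspace sketch in C. It is not the isci or libsas code: the
struct, the function names, and the simulated millisecond clock are all
hypothetical, and the sleep a real driver would do between polls is
replaced by stepping a counter. It only illustrates the coupling that a
bare delay in the reset handler lacks: a bcn that arrives while the reset
is in flight is held back, and the discovery layer is notified once,
after the poll has finished.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one port; none of these names exist in libsas. */
struct port_model {
    bool reset_in_progress;   /* set while the sata device is being reset */
    bool bcn_deferred;        /* a bcn arrived during the reset and was held */
    int  rediscover_requests; /* times the discovery layer was notified */
    int  reattach_at_ms;      /* simulated time the link returns; < 0 = never */
};

/* bcn handler: forward at once unless a reset is in flight. */
static void port_bcn_received(struct port_model *p)
{
    if (p->reset_in_progress) {
        p->bcn_deferred = true;   /* filter: remember it, do not rediscover yet */
        return;
    }
    p->rediscover_requests++;
}

/* Stand-in for reading the real link/attach state from hardware. */
static bool device_reattached(const struct port_model *p, int now_ms)
{
    return p->reattach_at_ms >= 0 && now_ms >= p->reattach_at_ms;
}

/*
 * Reset the device, then poll for reattach for up to timeout_ms.
 * bcn_at_ms is the simulated arrival time of a bcn (< 0 for none).
 * Returns true if the device came back within the timeout.
 */
static bool port_reset_and_wait(struct port_model *p, int timeout_ms,
                                int poll_interval_ms, int bcn_at_ms)
{
    bool back = false;
    bool bcn_delivered = false;

    assert(poll_interval_ms > 0);
    p->reset_in_progress = true;

    for (int now = 0; now <= timeout_ms; now += poll_interval_ms) {
        if (bcn_at_ms >= 0 && !bcn_delivered && now >= bcn_at_ms) {
            port_bcn_received(p);
            bcn_delivered = true;
        }
        if (device_reattached(p, now)) {
            back = true;
            break;
        }
        /* a real driver would sleep for poll_interval_ms here */
    }

    p->reset_in_progress = false;

    /* Notify discovery once: for a held bcn, or because the device is gone. */
    if (p->bcn_deferred || !back) {
        p->bcn_deferred = false;
        p->rediscover_requests++;
    }
    return back;
}
```

With a 10 second timeout and a device that needs 5 seconds to come back,
a bcn arriving 200 ms into the reset triggers no rediscovery until the
device is attached again, so the discovery pass finds it present rather
than marking it gone.
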
--
Dan