James Bottomley wrote:
> On Tue, 2006-06-13 at 14:37 -0500, Michael Reed wrote:
>> Not really true as the transport holds off the error handler until the
>> transport dev loss timer expires.
>>
>> And afterwards, commands are returned immediately with DID_NO_CONNECT.
>> The device is never offlined (with my patch applied).
>
> That was just a general examination of the options for retaining contact
> with the target.
>
> It seems we both agree that returning an error is about the only viable
> option, in which case the user or application has to take a recovery
> action anyway, so there's no logical difference between what you propose
> and what we currently do as far as the application or filesystem is
> concerned.
>
> The only difference is what happens if the device reappears. However,
> since the application has to be modified in either case: your patch to
> continually probe with I/O to see if the device has returned,

I'm not suggesting that any application would probe with i/o, though it
may or may not be doing that today. If it is, the difference is that the
i/o will have the possibility of success when the target ultimately
returns. With the current code, the i/o will never, ever, succeed.
(Without app change, of course.)

> or the
> existing case to wait out the udev event that says the device is back it
> doesn't really buy us anything for the application.

BTW, I consider "application" to include kernel code such as volume
managers and file systems.

The applications don't require any modifications with the new patch.
They still get failure notification in either case. They still fail
to work while the target is disconnected. They can choose to terminate
or not.

> Since the rest of
> our infrastructure is already event driven, or migrating that way, I
> really don't see value in introducing an anomaly like this purely for
> fibre channel.

It's tough on fibre channel, being first. :)

Among the benefits of this patch is the purchase of time.
With the fc infrastructure the way it is, you're assured of forcing
developers to "publish or perish". That may be the intended desire.
It just doesn't seem fair to the users who have to deal with this.

It makes sense to me to implement the event driven infrastructure in such
a way that it's more complete when released. If infrastructure is going
to be removed, then "applications" have to be adjusted to accommodate this.
It shouldn't be, oh by the way, your driver/app is now broken, hurry up
and fix it or your users will complain. [End Of Rant].

My patch buys time.

Change the default so that the remove on disconnect has to be consciously
overridden. Remove the variable when the supporting infrastructure is
in place. Put out a message indicating that the option of not removing
the infrastructure is "going away" in a future release. Provide an
orderly transition. Insure domestic tranquility. Promote the general
welfare. :)

I'm happy to adjust the patch to accommodate any of these suggestions
if they are deemed acceptable.

Thanks for taking the time to consider and discuss this issue. I see
your point and I've made mine. I trust your judgment.

Thanks,
Mike

>
> James
>