On Wed, 2014-06-11 at 16:33 +0200, Hannes Reinecke wrote:
> On 06/11/2014 04:24 PM, James Bottomley wrote:
> > On Thu, 2014-06-05 at 09:26 +0200, Hannes Reinecke wrote:
> >> REPORT_LUN_SCAN does not report any outstanding unit attention
> >> condition as per SAM. However, the target might not be fully
> >> initialized at that time, so we might end up getting a
> >> default entry (or even a partially filled one).
> >> But as we're not able to process the REPORT LUN DATA HAS CHANGED
> >> unit attention correctly we'll be missing out some LUNs during
> >> startup.
> >> So it's better to send a TEST UNIT READY for modern implementations
> >> and wait until the unit attention condition goes away.
> >
> > Are you sure this is a good idea: we just spent ages tuning SCSI init so
> > we don't slow systems down. This patch, in the event the array is
> > having a power on problem, takes us right back to waiting for init
> > again ... basically the busy wait in scsi_test_lun.
> >
> > Since the array should send us a UA anyway when it's got itself sorted
> > out, what's wrong with just processing the report luns data has changed
> > condition?
> >
> Because we can't.
>
> _If_ we were attempting this we'd run into several issues:
> a) Boot will fail, as REPORT LUNs will return 0 LUNs (or just LUN 0).
>    So the scanning code will assume everything's fine. Booting will
>    continue, only to figure out that no LUNs are present.
>    As there is _no_ indication that REPORT LUNs should indeed have
>    returned an error (only it can't due to SAM) we wouldn't even
>    know that there _is_ an issue.
>    (In fact, that's what triggered the patchset in the first place.)
> b) Even _if_ we're able to somehow recover from that we will have
>    to rescan the host and any attached devices.
>    The only way to do this currently is to _remove_ all devices
>    from that host and then do a full rescan.
>    Trying this with any devices which are already part of some
>    complex setup will become ... interesting.

OK, go back to first principles and tell us what the actual problem is,
with traces and details. Is this some weird SCSI-3 device with a single
LUN that's screwing up report luns ... in which case we can just
blacklist it. Or is it boot from an array?

> So the easy way out here is indeed just to send a TEST UNIT READY.
> And as we're checking for reasonable SCSI compliance we should
> be catching most of the oddballs.

I don't object hugely to TUR ... except it binds us to spin up because
most devices will respond not ready. I do object to busy waiting in the
init thread until we get the right answer.

James
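
The trade-off argued over in this thread can be illustrated with a toy model. This is only a sketch, not kernel code: `FakeTarget`, `scan`, `init_polls` and `max_tur_tries` are hypothetical names invented for illustration, and treating "time passing" as a count of polls is an assumption. It models a target that answers REPORT LUNS with a default entry while it is still initializing (because the unit attention cannot be returned on that command), and a scan that polls TEST UNIT READY a bounded number of times first instead of busy waiting indefinitely.

```python
# Toy model only: none of these names exist in the Linux SCSI layer.
GOOD = "GOOD"
UNIT_ATTENTION = "UNIT ATTENTION"


class FakeTarget:
    """Hypothetical target that needs `init_polls` polls before it is ready."""

    def __init__(self, luns, init_polls):
        self.luns = list(luns)
        self.init_polls = init_polls

    def test_unit_ready(self):
        # TUR is allowed to report the pending condition.
        if self.init_polls > 0:
            self.init_polls -= 1
            return UNIT_ATTENTION
        return GOOD

    def report_luns(self):
        # REPORT LUNS does not return the unit attention, so an
        # initializing target just answers with a default entry (LUN 0).
        return [0] if self.init_polls > 0 else list(self.luns)


def scan(target, max_tur_tries):
    """Poll TUR at most `max_tur_tries` times, then issue REPORT LUNS."""
    for _ in range(max_tur_tries):
        if target.test_unit_ready() == GOOD:
            break
    return target.report_luns()
```

In this model, `scan(FakeTarget([0, 1, 2, 3], init_polls=3), 0)` sees only LUN 0 with no error indication, which is the silent failure described in point a). With a generous bound the scan sees all four LUNs; with a bound that is too small it returns promptly but still misses LUNs, which is the cost of not waiting in the init thread.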