On 04/05/2013 05:24 PM, James Smart wrote:
>
> On 4/4/2013 6:17 AM, Hannes Reinecke wrote:
>> On 03/31/2013 07:44 PM, Tomas Henzl wrote:
>>> What we can do is to decode the LUN and compare it to max_lun provided by the driver,
>>> I think that sg_luns is able to do that, so what is needed is just to follow the SAM.
>>>
>>> I have seen reports of problem on three different drivers connected to various
>>> external storage, all of them having the same basic reason - the driver sets a max_lun
>>> and then LUN comes encoded with a newer addressing method and something like this is shown
>>> 'kernel: scsi: host 2 channel 0 id 2 lun16643 has a LUN larger than allowed by the host adapter'
>>>
>>> Decoding the real LUN value would fix this problem, by decoding is only meant the use in
>>> scsi_report_lun_scan. The LUN would be stored exactly the same way as it is now.
>>> I know we can patch the certain drivers too, but when max_lun were what the name says
>>> - max LU number, it would fix my problem very easy.
>>>
>> Errm.
>>
>> No. Decoding LUNs is _evil_. It has only a relevance on the target,
>> and even then it might choose to ignore it.
>> So we cannot try to out-guess the target here.

OK, I can see the problems with decoding the LUN; one of them is the
need to encode the LUN again into address format + number. I'm not sure
whether the hw would work if another address mode were used.
If we understand the LUN as a complex structure, then it makes no sense
to compare it to max_lun as a number -
http://lxr.linux.no/#linux+v3.8.6/drivers/scsi/scsi_scan.c#L1471

>> The error you're reporting is that lpfc is setting max_luns to
>> '255', which of course is less than 16643. Increasing max_luns on
>> lpfc to '0xFFFF' will fix your problem; nothing to do with 64-bit
>> LUNs ...

I don't think I mentioned lpfc, but it doesn't matter.
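
To make the "decoding" idea from the quoted proposal concrete, here is a
rough sketch in Python (not kernel code). It assumes the reported value
16643 is the first 16-bit level of a SAM LUN with all other levels zero,
and it only splits off the two address-method bits. The helper name
decode_first_level is made up for this illustration.

```python
# Illustration only: split the first 16-bit level of a SAM LUN into the
# address method (top 2 bits) and the remaining 14 bits.
# decode_first_level is a hypothetical helper, not an existing kernel or
# sg3_utils function.

ADDRESS_METHODS = {
    0b00: "peripheral device",
    0b01: "flat space",
    0b10: "logical unit",
    0b11: "extended logical unit",
}


def decode_first_level(lun16):
    """Return (address method name, low 14 bits) for a 16-bit LUN level.

    For flat space addressing the 14 bits are the LUN itself; the other
    methods give those bits more structure, which this sketch ignores.
    """
    if not 0 <= lun16 <= 0xFFFF:
        raise ValueError("expected a single 16-bit LUN level")
    method = lun16 >> 14      # bits 15..14: address method
    value = lun16 & 0x3FFF    # bits 13..0: method-specific payload
    return ADDRESS_METHODS[method], value


method, value = decode_first_level(16643)   # 16643 == 0x4103
print(method, value)                        # flat space 259
```

The stored LUN would stay 16643; only the comparison against max_lun
would look at the decoded 259.
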
Fixing this in individual drivers by increasing max_lun is not easy,
because the firmware could have reasons for its max LUN (some
tables, ...; the fact is I have no idea how this is implemented in the hw).
If the fix were just to set max_lun to 0xFFFF in every driver, it
would mean that we could remove max_lun and the test completely.
A kernel option like 'ignore_max_lun' would help, but I somehow dislike
it. What do you think?

> The reason lpfc set max_luns to 255 is due to the midlayer using
> max_luns as a (SCSI-2 device) max sequential scan loop top value, not
> necessarily as a max lun # as what's now in the report luns scan loop.
> When we were attached to jbods (loop, etc) - we saw 2 problems: our scan
> time dramatically increased (several minutes based on a 16k max_lun
> value); and as the jbod only decoded 8 bits - it happened to respond
> successfully to any lun value where the lower 8-bits were 0, meaning
> lots of midlayer "ghost" devices were created when in reality there was
> only 1 lun present. Changing the max_luns value is fine as long as
> you know what's attached.
>
> -- james s
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
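
The ghost-device effect described above can be modelled with a few lines
of Python. This is a simplified sketch, not the midlayer's scan code: it
assumes a target that looks only at the low 8 bits of the LUN and has a
single real LUN 0, and it takes "16k" to mean 16 * 1024. The function
name simulated_sequential_scan is hypothetical.

```python
# Simplified model of a sequential LUN scan against a target that only
# decodes the low 8 bits of the LUN. Illustration only, not kernel code.

def simulated_sequential_scan(max_lun, real_luns=(0,)):
    """Return every LUN in [0, max_lun) that the modelled target answers."""
    found = []
    for lun in range(max_lun):
        # The target ignores the upper bits, so any LUN whose low byte
        # matches a real LUN looks like a present device.
        if (lun & 0xFF) in real_luns:
            found.append(lun)
    return found


found = simulated_sequential_scan(16 * 1024)
ghosts = [lun for lun in found if lun != 0]
print(len(found), len(ghosts))               # 64 63
print(len(simulated_sequential_scan(255)))   # 1
```

With the scan limit at 255 the loop never reaches a LUN whose low byte
wraps back to 0, so only the one real device shows up in this model.
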