Re: The PQ=1 saga

I did some more testing of this, since it has been a while since I last
ran these tests. It looks like reverting this will make the particular situation
that I am worried about even worse. The details are below.

With this patch in place (before you revert it), when SCSI devices are discovered
and some report PQ=1 because they are in the ALUA unavailable state, I see:

Jan 27 12:05:29 localhost kernel: scsi 7:0:0:1: scsi scan: peripheral device type of 31, no device added

I don’t know if this is intentional with the patch or not, but any device with PQ=1
will not get a SCSI device created. The logging is deceptive too, since the device type
is 0 and not 31. In my case I have two paths to LUN 1: one is ALUA AO and the
other is ALUA unavailable.
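
For reference, both values come from byte 0 of the standard INQUIRY data: the
peripheral qualifier (PQ) is the top three bits and the peripheral device type
is the low five bits. A minimal sketch of that decoding follows; the function
name is made up for illustration and is not taken from the kernel.

```python
from typing import Tuple


def decode_inquiry_byte0(byte0: int) -> Tuple[int, int]:
    """Split INQUIRY byte 0 into (peripheral qualifier, peripheral device type)."""
    if not 0 <= byte0 <= 0xFF:
        raise ValueError("byte0 must fit in one byte")
    pq = (byte0 >> 5) & 0x07   # bits 7..5: peripheral qualifier
    pdt = byte0 & 0x1F         # bits 4..0: peripheral device type
    return pq, pdt


# PQ=1 with device type 0 (direct access), like the unavailable path here
print(decode_inquiry_byte0(0x20))   # (1, 0)
# PQ=3 with device type 0x1f (31)
print(decode_inquiry_byte0(0x7F))   # (3, 31)
```

So a path that reports PQ=1 with device type 0 has byte 0 equal to 0x20, which
is why a log line mentioning device type 31 is misleading for it.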

With this patch in, I only get an sd device and an sg device for the AO path.
The other path to LUN 1 gets no devices created because it is caught by the
if condition logged above.

Because no SCSI devices are created, when the ALUA state returns
to an active state, a SCSI rescan (which I can trigger from the target) will result
in the devices getting created, since the initial scan never created them.

Jan 27 12:26:04 localhost kernel: scsi 7:0:0:1: scsi scan: INQUIRY pass 1 length 36
Jan 27 12:26:04 localhost kernel: scsi 7:0:0:1: scsi scan: INQUIRY successful with code 0x0
Jan 27 12:26:04 localhost kernel: scsi 7:0:0:1: scsi scan: INQUIRY pass 2 length 96
Jan 27 12:26:04 localhost kernel: scsi 7:0:0:1: scsi scan: INQUIRY successful with code 0x0
Jan 27 12:26:04 localhost kernel: scsi 7:0:0:1: Direct-Access     PURE     FlashArray       8888 PQ: 0 ANSI: 6

Things are then good, with both paths to LUN 1 showing up. It is not optimal, since the
target has to trigger a LUN scan on the initiator, affecting all paths to those target
ports.

With the revert of this, things are a little different, but back to the way they had
been in the past.

Jan 27 13:41:19 localhost kernel: sd 7:0:1:1: Asymmetric access state changed
Jan 27 13:41:56 localhost kernel: scsi 7:0:1:1: alua: Detached
Jan 27 13:42:22 localhost kernel: scsi 7:0:1:1: scsi scan: INQUIRY pass 1 length 36
Jan 27 13:42:22 localhost kernel: scsi 7:0:1:1: scsi scan: INQUIRY successful with code 0x0
Jan 27 13:42:22 localhost kernel: scsi 7:0:1:1: scsi scan: INQUIRY pass 2 length 96
Jan 27 13:42:22 localhost kernel: scsi 7:0:1:1: scsi scan: INQUIRY successful with code 0x0
Jan 27 13:42:22 localhost kernel: scsi 7:0:1:1: Direct-Access     PURE     FlashArray       8888 PQ: 1 ANSI: 6
Jan 27 13:42:22 localhost kernel: scsi 7:0:1:1: alua: supports implicit TPGS
Jan 27 13:42:22 localhost kernel: scsi 7:0:1:1: alua: device naa.624a9370acc31b042de141460001141c port group 0 rel port a
Jan 27 13:42:22 localhost kernel: scsi 7:0:1:1: Attached scsi generic sg7 type 0

Now an sg device is created but not an sd device. This means that there will be
no way for this device to get an sd device created once the ALUA state goes into
an active state.

The same rescan triggered from the target, which worked above, no longer helps:

Jan 27 13:47:48 localhost kernel: scsi 7:0:1:1: scsi scan: device exists on 7:0:1:1

To get around this, the existing device must be deleted so it is not caught in the rescan
check. This cannot be controlled from the target; it requires manual intervention
on the initiator.
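
As a rough illustration of that manual step, the sketch below builds the two
sysfs writes that would delete the stale device and rescan just that LUN. It
assumes the usual sysfs layout (`/sys/class/scsi_device/<H:C:T:L>/device/delete`
and `/sys/class/scsi_host/host<H>/scan`); the helper names are hypothetical, and
it defaults to a dry run that only prints what it would do.

```python
from typing import List, Tuple


def delete_and_rescan_writes(hctl: str) -> List[Tuple[str, str]]:
    """Return (path, value) pairs for deleting one SCSI device and rescanning it."""
    # hctl is host:channel:target:lun, e.g. "7:0:1:1"; int() rejects malformed parts
    host, channel, target, lun = (int(part) for part in hctl.split(":"))
    name = f"{host}:{channel}:{target}:{lun}"
    return [
        (f"/sys/class/scsi_device/{name}/device/delete", "1"),
        (f"/sys/class/scsi_host/host{host}/scan", f"{channel} {target} {lun}"),
    ]


def apply_writes(writes: List[Tuple[str, str]], dry_run: bool = True) -> None:
    """Print the writes, or perform them when dry_run is False (needs root)."""
    for path, value in writes:
        if dry_run:
            print(f"echo '{value}' > {path}")
        else:
            with open(path, "w") as handle:
                handle.write(value)


apply_writes(delete_and_rescan_writes("7:0:1:1"))
```

This has to be run on the initiator, which is exactly the kind of manual
intervention that cannot be driven from the target.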

So the question becomes: how should the initial scan work when a LUN reports PQ=1?
It is valid by the spec with the ALUA state unavailable, but it doesn’t seem to be
handled. Why allow an sg device but not an sd one on initial scan in this case? There
are probably many ways to fix this. I think the simplest is to allow sd device creation
on LUNs where PQ=1, and only restrict PQ=3. I am not sure what side effects this would
have on other targets. The other approach, which will no longer work after the revert,
is to trigger a rescan from the target. This is sub-optimal since it is disruptive. Any
approach involving the ALUA device handler will not help, since there is no device to
transition if it is discovered with PQ=1.
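
To make the suggestion concrete, here is a toy model of the attach decision. It
is not kernel code: it only encodes the behavior described above (an sd device
is created for PQ=0 but not for PQ=1) next to the proposed one (allow PQ=0 and
PQ=1, still refuse PQ=3). The function and its `allow_pq1` parameter are
invented for this sketch.

```python
def sd_attach_allowed(pq: int, allow_pq1: bool = False) -> bool:
    """Model whether an sd device would be created for a given peripheral qualifier."""
    if pq == 0:
        return True          # device connected: always attach
    if pq == 1:
        return allow_pq1     # e.g. ALUA unavailable: attach only under the proposal
    return False             # PQ=3 and anything else: never attach


# Columns: PQ, behavior observed after the revert, proposed behavior
for pq in (0, 1, 3):
    print(pq, sd_attach_allowed(pq), sd_attach_allowed(pq, allow_pq1=True))
```

Only the PQ=1 row differs between the two policies, which is the case the
unavailable path hits.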

Thanks,
Brian


> On Jan 26, 2023, at 1:01 AM, Hannes Reinecke <hare@xxxxxxx> wrote:
> 
> On 1/25/23 09:33, Martin Wilck wrote:
>> On Tue, 2023-01-24 at 17:41 -0800, Bart Van Assche wrote:
>>> On 1/24/23 16:01, Martin K. Petersen wrote:
>>>> I would like to revert commit 948e922fc446 ("scsi: core: map PQ=1,
>>>> PDT=other values to SCSI_SCAN_TARGET_PRESENT").
>>> 
>>> That sounds good to me.
>>> 
>>> Bart.
>>> 
>> I agree.
> Yep.
> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke                Kernel Storage Architect
> hare@xxxxxxx                              +49 911 74053 688
> SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
> Myers, Andrew McDonald, Martje Boudien Moerman
> 




