Re: Port Multiplier drives not always all found on cold plug

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Mark Lord wrote:
Tejun,

Since enabling PMP support in sata_mv, there have been some recent
reports of not all drives being found on a PM.

I can reproduce this here myself, too:

(1) silicon image PM with four drives attached.
(2) power-down the PM.
(3) rmmod sata_mv ; modprobe sata_mv (latest upstream).
(4) plug (power off) PM into sata_mv card (7042 chipset).
(5) power on the PM.
(6) watch only 2-3 of the four drives get found.

If the PM was already powered on in advance, then all drives are found.
sata_mv relies entirely on libata-*.c for PMP resets/enumerations,
so I'm not sure that the problem is in sata_mv.

Any ideas?
..

Mmmm.. one strange thing from the logs.
sata_mv reports "unexpected device interrupt" in a few places,
and then triggers EH to recover from it.
This could be what is causing the PMP enumerations to partially fail.

I wonder why we're getting interrupts when supposedly idle/polling ?
Did something in libata miss reading ata_status to clear an IRQ ?

Here's the original (failing) log:

[ 1222.304368] sata_mv 0000:03:00.0: version 1.20
[ 1222.304796] ACPI: PCI Interrupt 0000:03:00.0[A] -> GSI 19 (level, low) -> IRQ 19
[ 1222.304887] sata_mv 0000:03:00.0: Applying 60X1C0 workarounds to unknown rev
[ 1222.304942] sata_mv 0000:03:00.0: Gen-IIE 32 slots 4 ports SCSI mode IRQ via INTx
[ 1222.304954] PCI: Setting latency timer of device 0000:03:00.0 to 64
[ 1222.307587] scsi14 : sata_mv
[ 1222.307587] scsi15 : sata_mv
[ 1222.307587] scsi16 : sata_mv
[ 1222.307718] scsi17 : sata_mv
[ 1222.307718] ata15: SATA max UDMA/133 mmio m1048576@0xff400000 port 0xff422000 irq 19
[ 1222.307718] ata16: SATA max UDMA/133 mmio m1048576@0xff400000 port 0xff424000 irq 19
[ 1222.307718] ata17: SATA max UDMA/133 mmio m1048576@0xff400000 port 0xff426000 irq 19
[ 1222.307718] ata18: SATA max UDMA/133 mmio m1048576@0xff400000 port 0xff428000 irq 19
[ 1222.849717] ata15: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1222.893000] ata15.00: ATA-7: ST3750640NS, 3.BAF, max UDMA/133
[ 1222.893000] ata15.00: 1465149168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[ 1222.893000] ata15.00: failed to set xfermode (err_mask=0x40)
[ 1222.893000] ata15: failed to recover some devices, retrying in 5 secs
[ 1229.574577] ata15: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1229.621209] ata15.00: failed to set xfermode (err_mask=0x40)
[ 1229.621209] ata15: limiting SATA link speed to 1.5 Gbps
[ 1229.621209] ata15.00: limiting speed to UDMA/133:PIO3
[ 1229.621209] ata15: failed to recover some devices, retrying in 5 secs
[ 1236.859860] ata15: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 1236.896494] ata15.00: failed to set xfermode (err_mask=0x40)
[ 1236.896494] ata15.00: disabled

Note the "unexpected device interrupt while idle" error, which triggers EH:

[ 1237.399858] ata15: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen t4
[ 1237.400393] ata15: unexpected device interrupt while idle
[ 1237.400796] ata15: hard resetting link
[ 1237.901885] ata15: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 1237.901901] ata15: EH complete
[ 1238.385209] ata16: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1238.385183] ata16.00: ATA-8: Hitachi HDP725050GLA360, GM4OA50E, max UDMA/133
[ 1238.385183] ata16.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[ 1238.385183] ata16.00: failed to set xfermode (err_mask=0x40)
[ 1238.385183] ata16: failed to recover some devices, retrying in 5 secs
[ 1245.329460] ata16: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1245.336654] ata16.00: failed to set xfermode (err_mask=0x40)
[ 1245.349426] ata16: limiting SATA link speed to 1.5 Gbps
[ 1245.349430] ata16.00: limiting speed to UDMA/133:PIO3
[ 1245.349433] ata16: failed to recover some devices, retrying in 5 secs
[ 1252.526624] ata16: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 1252.533808] ata16.00: failed to set xfermode (err_mask=0x40)
[ 1252.533864] ata16.00: disabled

Note the "unexpected device interrupt while idle" error, which triggers EH:

[ 1253.176196] ata16: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen t4
[ 1253.176290] ata16: unexpected device interrupt while idle
[ 1253.176349] ata16: hard resetting link
[ 1253.649543] ata16: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 1253.649559] ata16: EH complete
[ 1253.982862] ata17: SATA link down (SStatus 0 SControl 300)
[ 1254.302862] ata18: SATA link down (SStatus 0 SControl 300)
[ 1314.600817] ACPI: PCI interrupt for device 0000:03:00.0 disabled
[ 1316.843654] sata_mv 0000:03:00.0: version 1.20
[ 1316.843654] ACPI: PCI Interrupt 0000:03:00.0[A] -> GSI 19 (level, low) -> IRQ 19
[ 1316.843654] sata_mv 0000:03:00.0: Applying 60X1C0 workarounds to unknown rev
[ 1316.843654] sata_mv 0000:03:00.0: Gen-IIE 32 slots 4 ports SCSI mode IRQ via INTx
[ 1316.843654] PCI: Setting latency timer of device 0000:03:00.0 to 64
[ 1316.843654] scsi18 : sata_mv
[ 1316.845591] scsi19 : sata_mv
[ 1316.847159] scsi20 : sata_mv
[ 1316.848918] scsi21 : sata_mv
[ 1316.848918] ata19: SATA max UDMA/133 mmio m1048576@0xff400000 port 0xff422000 irq 19
[ 1316.848918] ata20: SATA max UDMA/133 mmio m1048576@0xff400000 port 0xff424000 irq 19
[ 1316.848918] ata21: SATA max UDMA/133 mmio m1048576@0xff400000 port 0xff426000 irq 19
[ 1316.848918] ata22: SATA max UDMA/133 mmio m1048576@0xff400000 port 0xff428000 irq 19
[ 1317.319523] ata19: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1317.372879] ata19.00: ATA-7: ST3750640NS, 3.BAF, max UDMA/133
[ 1317.372886] ata19.00: 1465149168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[ 1317.449552] ata19.00: configured for UDMA/133
[ 1318.025102] ata20: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1318.041775] ata20.00: ATA-8: Hitachi HDP725050GLA360, GM4OA50E, max UDMA/133
[ 1318.041781] ata20.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[ 1318.060953] ata20.00: configured for UDMA/133
[ 1318.396766] ata21: SATA link down (SStatus 0 SControl 300)
[ 1318.740937] ata22: SATA link down (SStatus 0 SControl 300)
[ 1318.590882] scsi 18:0:0:0: Direct-Access     ATA      ST3750640NS      3.BA PQ: 0 ANSI: 5
[ 1318.590882] sd 18:0:0:0: [sdb] 1465149168 512-byte hardware sectors (750156 MB)
[ 1318.590882] sd 18:0:0:0: [sdb] Write Protect is off
[ 1318.590882] sd 18:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 1318.590882] sd 18:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 1318.590882] sd 18:0:0:0: [sdb] 1465149168 512-byte hardware sectors (750156 MB)
[ 1318.590882] sd 18:0:0:0: [sdb] Write Protect is off
[ 1318.590882] sd 18:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 1318.590882] sd 18:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 1318.590882]  sdb: sdb1
[ 1318.764651] sd 18:0:0:0: [sdb] Attached SCSI disk
[ 1318.767985] sd 18:0:0:0: Attached scsi generic sg2 type 0
[ 1318.767985] scsi 19:0:0:0: Direct-Access     ATA      Hitachi HDP72505 GM4O PQ: 0 ANSI: 5
[ 1318.767985] sd 19:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
[ 1318.767985] sd 19:0:0:0: [sdc] Write Protect is off
[ 1318.767985] sd 19:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 1318.767985] sd 19:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1318.771070] sd 19:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
[ 1318.771070] sd 19:0:0:0: [sdc] Write Protect is off
[ 1318.771070] sd 19:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 1318.771070] sd 19:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1318.771070]  sdc: sdc1
[ 1318.795749] sd 19:0:0:0: [sdc] Attached SCSI disk
[ 1318.795749] sd 19:0:0:0: Attached scsi generic sg3 type 0

Those drives (above) are directly attached to the marvell controller.
Now, below, we power on the silimage PM with four additional drives.

[ 1341.310527] ata22: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
[ 1341.312815] ata22: edma_err_cause=00000010 pp_flags=00000000, dev connect
[ 1341.313286] ata22: SError: { PHYRdyChg DevExch }
[ 1341.313561] ata22: hard resetting link
[ 1342.852379] ata22: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1342.853584] ata22.15: Port Multiplier 1.1, 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
[ 1342.855237] ata22.00: hard resetting link
[ 1343.335991] ata22.01: hard resetting link
[ 1344.075666] ata22.02: hard resetting link
[ 1350.898092] ata22.15: link is slow to respond, please be patient (ready=0)
[ 1357.122516] ata22.02: SRST failed (errno=-16)
[ 1357.122961] ata22.02: failed to read SCR 0 (Emask=0x100)
[ 1357.123309] ata22.02: reset failed, giving up
[ 1357.123653] ata22.15: hard resetting link
[ 1357.649204] ata22.15: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1357.650414] ata22.03: hard resetting link
[ 1364.843567] ata22.15: link is slow to respond, please be patient (ready=0)
[ 1367.988474] ata22.04: hard resetting link
[ 1368.461807] ata22.05: hard resetting link
[ 1368.788713] ata22.03: ATA-7: ST3400832AS, 3.03, max UDMA/133
[ 1368.789078] ata22.03: 781422768 sectors, multi 0: LBA48 NCQ (depth 31/32)
[ 1368.802056] ata22.03: configured for UDMA/133
[ 1368.808714] ata22.04: ATA-7: ST3400832AS, 3.03, max UDMA/133
[ 1368.809075] ata22.04: 781422768 sectors, multi 0: LBA48 NCQ (depth 31/32)
[ 1368.822052] ata22.04: configured for UDMA/133
[ 1368.822548] ata22: PMP SError.N set for some ports, repeating recovery
[ 1368.823052] ata22.00: hard resetting link
[ 1373.304201] ata22.00: ATA-7: ST3400832AS, 3.03, max UDMA/133
[ 1373.304574] ata22.00: 781422768 sectors, multi 0: LBA48 NCQ (depth 31/32)
[ 1373.317543] ata22.00: configured for UDMA/133
[ 1373.474220] ata22.03: configured for UDMA/133
[ 1373.488812] ata22.04: configured for UDMA/133

Note the "unexpected device interrupt while polling" error, which triggers EH:

[ 1373.489906] ata22.15: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x1 t4
[ 1373.490384] ata22.15: unexpected device interrupt while polling
[ 1373.495008] ata22: EH complete
[ 1373.495374] scsi 21:0:0:0: Direct-Access     ATA      ST3400832AS      3.03 PQ: 0 ANSI: 5
[ 1373.496005] sd 21:0:0:0: [sdd] 781422768 512-byte hardware sectors (400088 MB)
[ 1373.496048] sd 21:0:0:0: [sdd] Write Protect is off
[ 1373.496052] sd 21:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[ 1373.496127] sd 21:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1373.496418] sd 21:0:0:0: [sdd] 781422768 512-byte hardware sectors (400088 MB)
[ 1373.496459] sd 21:0:0:0: [sdd] Write Protect is off
[ 1373.496463] sd 21:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[ 1373.496539] sd 21:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1373.496568]  sdd: sdd1
[ 1373.516434] sd 21:0:0:0: [sdd] Attached SCSI disk
[ 1373.517411] sd 21:0:0:0: Attached scsi generic sg4 type 0
[ 1373.518343] scsi 21:3:0:0: Direct-Access     ATA      ST3400832AS      3.03 PQ: 0 ANSI: 5
[ 1373.520255] sd 21:3:0:0: [sde] 781422768 512-byte hardware sectors (400088 MB)
[ 1373.520300] sd 21:3:0:0: [sde] Write Protect is off
[ 1373.520306] sd 21:3:0:0: [sde] Mode Sense: 00 3a 00 00
[ 1373.520382] sd 21:3:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1373.520546] sd 21:3:0:0: [sde] 781422768 512-byte hardware sectors (400088 MB)
[ 1373.520590] sd 21:3:0:0: [sde] Write Protect is off
[ 1373.520595] sd 21:3:0:0: [sde] Mode Sense: 00 3a 00 00
[ 1373.520673] sd 21:3:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1373.520703]  sde: sde1
[ 1373.532356] sd 21:3:0:0: [sde] Attached SCSI disk
[ 1373.532356] sd 21:3:0:0: Attached scsi generic sg5 type 0
[ 1373.532393] scsi 21:4:0:0: Direct-Access     ATA      ST3400832AS      3.03 PQ: 0 ANSI: 5
[ 1373.532393] sd 21:4:0:0: [sdf] 781422768 512-byte hardware sectors (400088 MB)
[ 1373.532393] sd 21:4:0:0: [sdf] Write Protect is off
[ 1373.532393] sd 21:4:0:0: [sdf] Mode Sense: 00 3a 00 00
[ 1373.532393] sd 21:4:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1373.532393] sd 21:4:0:0: [sdf] 781422768 512-byte hardware sectors (400088 MB)
[ 1373.532393] sd 21:4:0:0: [sdf] Write Protect is off
[ 1373.532393] sd 21:4:0:0: [sdf] Mode Sense: 00 3a 00 00
[ 1373.532393] sd 21:4:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1373.532423]  sdf: sdf1
[ 1373.563088] sd 21:4:0:0: [sdf] Attached SCSI disk
[ 1373.564928] sd 21:4:0:0: Attached scsi generic sg6 type 0

Only 3/4 drives were found on the PM.

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux