On Fri, Aug 26, 2011 at 08:58:15AM +0200, Tejun Heo wrote:
> Hello,
>
> On Thu, Aug 25, 2011 at 11:40:50PM -0700, Marc MERLIN wrote:
> > ata11.15: hard resetting link
> > ata11.15: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata11.15: Port Multiplier vendor mismatch '0x1095' != '0x101'
> > ata11.15: PMP revalidation failed (errno=-19)
> > ata11.15: hard resetting link
> > ata11.15: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata11.15: Port Multiplier vendor mismatch '0x1095' != '0x101'
> > ata11.15: PMP revalidation failed (errno=-19)
> > ata11.15: limiting SATA link speed to 1.5 Gbps
> > ata11.15: hard resetting link
> > ata11.15: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> > ata11.15: Port Multiplier vendor mismatch '0x1095' != '0x101'
> > ata11.15: PMP revalidation failed (errno=-19)
> > ata11.15: failed to recover PMP after 5 tries, giving up
> > ata11.15: Port Multiplier detaching
> > ata11.00: disabled
> > ata11.01: disabled
> > ata11.02: disabled
> > ata11.03: disabled
> > ata11.04: disabled
> > ata11.00: disabled
> > ata11: hard resetting link
> > ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > ata11.15: Port Multiplier <unknown>, 0x0101:0x9669 r1, 1 ports, feat 0x96690101/0x96690101
>
> Either the controller or port multiplier (more likely the controller)
> got completely confused. It's basically reporting random garbage for
> the identification data for the port multiplier. I don't know what
> went on there but probably the controller needed a strong kick in the
> butt to come back into sane state. Anyways, there isn't much the port
> multiplier layer can do if the controller is reporting garbage for
> data read from PMP.

Understood.
I'm glad you could read those logs better than I could :)

For what it's worth, the raid is still rebuilding 12 hours later. I was
initially getting some error/warning messages on the console last night,
but they have now stopped:

ata11.00: status: { DRDY }
ata11.01: exception Emask 0x100 SAct 0x1 SErr 0x0 action 0x6 frozen
ata11.01: failed command: READ FPDMA QUEUED
ata11.01: cmd 60/08:00:f7:0c:2c/00:00:56:00:00/40 tag 0 ncq 4096 in
         res 40/00:04:f7:0c:2c/00:00:56:00:00/40 Emask 0x100 (unknown error)
ata11.01: status: { DRDY }
ata11.02: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata11.03: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata11.04: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata11.05: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata11.15: exception Emask 0x100 SAct 0x0 SErr 0x400000 action 0x6 frozen
ata11.15: edma_err_cause=40000000 pp_flags=00000007
ata11.15: SError: { Handshk }
ata11.00: exception Emask 0x100 SAct 0x2 SErr 0x0 action 0x6 frozen
ata11.00: failed command: WRITE FPDMA QUEUED
ata11.00: cmd 61/f8:08:17:77:9e/03:00:5f:00:00/40 tag 1 ncq 520192 out
         res 40/00:04:0f:7b:9e/00:00:5f:00:00/40 Emask 0x100 (unknown error)
ata11.00: status: { DRDY }
ata11.01: exception Emask 0x100 SAct 0x1 SErr 0x0 action 0x6 frozen
ata11.01: failed command: READ FPDMA QUEUED
ata11.01: cmd 60/08:00:0f:7b:9e/00:00:5f:00:00/40 tag 0 ncq 4096 in
         res 40/00:04:0f:7b:9e/00:00:5f:00:00/40 Emask 0x100 (unknown error)
ata11.01: status: { DRDY }
ata11.02: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata11.03: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata11.04: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen

However, it looks like the first errors downgraded my ports to 1.5 Gbps,
and I'm now stuck with a much slower rebuild speed:

      [=========>...........]  recovery = 45.2% (884261384/1953511424) finish=771.1min speed=23109K/sec

Once I'm in that state, can I get back to 3.0 Gbps, or do I need to
reboot to get there?

> Mark, have you seen anything like this? Could it be that the
> controller goes out of proper configuration after certain condition
> and needs to be reset/reconfigured?

I know you meant Mark Lord, but if it helps, it looks like said
condition was only reached with a drive that had a genuine error when I
tried to do a raid rebuild.
Thankfully my system has otherwise been stable so far with a fair amount
of IO on those drives / PMP / card.

The upgrade to 3.0.1 probably didn't help much, but it can't hurt to try
that either.

Thanks for your reply,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
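
For reference, the finish estimate on the recovery progress line quoted
in the message can be reproduced from the other numbers on that same
line. What follows is a minimal Python sketch and not part of the
original exchange. It assumes the two block counts are in the same K
units as the speed figure, and the function name rebuild_eta_minutes is
made up for illustration.

```python
import re

# Progress line as quoted in the message (assumed: counts in K, speed in K/sec).
LINE = "recovery = 45.2% (884261384/1953511424) finish=771.1min speed=23109K/sec"


def rebuild_eta_minutes(line):
    """Return (percent_done, minutes_left) computed from a recovery progress line."""
    m = re.search(r"\((\d+)/(\d+)\).*speed=(\d+)K/sec", line)
    if m is None:
        raise ValueError("not a recovery progress line")
    done, total, speed = (int(g) for g in m.groups())
    percent_done = 100.0 * done / total
    # Remaining blocks divided by the current speed gives seconds left.
    minutes_left = (total - done) / speed / 60.0
    return percent_done, minutes_left


percent, minutes = rebuild_eta_minutes(LINE)
print(f"{percent:.1f}% done, about {minutes:.0f} min left at the current speed")
```

The computed time agrees with the reported finish=771.1min to within
rounding, so the estimate simply assumes the current 23109K/sec holds
for the rest of the rebuild.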