I've no idea how this is happening, but it appears that it's possible to read corrupted data over an eSATA link. The setup is a Solid-run Cubox-i4pro connected to an external eSATA 2.5" enclosure with a Corsair Neutron 128GB SSD.

The underlying issue seems to be the generally poor quality of eSATA cables (I have two eSATA enclosures; the other enclosure's eSATA cable interferes with a Logitech wireless receiver, and I've had to wrap the cable in aluminium foil). However, it shouldn't be possible to successfully read faulty data in that condition - and this is what I find most worrying.

What I've noticed is that at boot, the SSD is sometimes properly detected, other times it isn't:

[reboot at 10am]
ata1: softreset failed (device not ready)
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-8: Corsair Neutron SSD, M311, max UDMA/133
ata1.00: 250069680 sectors, multi 1: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA Corsair Neutron M311 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 250069680 512-byte logical blocks: (128 GB/119 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO
 sda: sda1 sda2 sda3
sd 0:0:0:0: [sda] Attached SCSI disk

[half an hour passes and a reboot later]
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-8: , , max UDMA/133
ata1.00: 250069680 sectors, multi 1: LBA48
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA n/a PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 250069680 512-byte logical blocks: (128 GB/119 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO
sd 0:0:0:0: [sda] Attached SCSI disk

[replugging the eSATA connector]
ata1: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
ata1: irq_stat 0x00000040, connection status changed
ata1: SError: { RecovComm PHYRdyChg CommWake DevExch }
ata1: hard resetting link
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-8: Corsair Neutron SSD, M311, max UDMA/133
ata1.00: 250069680 sectors, multi 1: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
ata1: EH complete
scsi 0:0:0:0: Direct-Access ATA Corsair Neutron M311 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 250069680 512-byte logical blocks: (128 GB/119 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO
 sda: sda1 sda2 sda3
sd 0:0:0:0: [sda] Attached SCSI disk

[another reboot]
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-8: Corsair Neutron SSD, M311, max UDMA/133
ata1.00: 250069680 sectors, multi 1: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA Corsair Neutron M311 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 250069680 512-byte logical blocks: (128 GB/119 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO
 sda: sda1 sda2 sda3
sd 0:0:0:0: [sda] Attached SCSI disk

When the SSD is mis-detected, as can be seen above, the partition table isn't recognised. Reading sector 0 shows:

00000000  18 f0 9f e5 18 f0 9f e5  18 f0 9f e5 18 f0 9f e5  |................|
00000010  18 f0 9f e5 00 00 a0 e1  14 f0 9f e5 14 f0 9f e5  |................|
00000020  f8 09 00 00 3c 00 00 00  40 00 00 00 1c f2 00 00  |....<...@.......|
00000030  04 f2 00 00 5c 12 00 00  90 12 00 00 fe ff ff ea  |....\...........|
00000040  fe ff ff ea 00 00 00 ea  da 0d 00 ea 28 00 8f e2  |............(...|
00000050  00 0c 90 e8 00 a0 8a e0  01 70 4a e2 00 b0 8b e0  |.........pJ.....|

which is some ARM code - it looks like early boot code, the kind which would normally be found in the vector page.
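
To back up the "vector page" reading of the bogus sector 0, here is a minimal Python sketch that decodes the first 0x50 bytes as little-endian ARM-mode words. It is not a real disassembler: it only recognises `ldr pc, [pc, #imm]`, `mov r0, r0` and unconditional `b`, and the helper names (`word_at`, `decode`, `vector_targets`) are made up for illustration.

```python
import struct

# First 0x50 bytes of the bad "sector 0", copied from the hexdump above.
BAD_SECTOR0 = bytes.fromhex(
    "18 f0 9f e5 18 f0 9f e5 18 f0 9f e5 18 f0 9f e5 "
    "18 f0 9f e5 00 00 a0 e1 14 f0 9f e5 14 f0 9f e5 "
    "f8 09 00 00 3c 00 00 00 40 00 00 00 1c f2 00 00 "
    "04 f2 00 00 5c 12 00 00 90 12 00 00 fe ff ff ea "
    "fe ff ff ea 00 00 00 ea da 0d 00 ea 28 00 8f e2"
)

# Classic ARM exception vector slots at offsets 0x00..0x1c.
VECTORS = ["reset", "undef", "svc", "prefetch_abort",
           "data_abort", "reserved", "irq", "fiq"]


def word_at(data, addr):
    """Read one little-endian 32-bit word."""
    return struct.unpack_from("<I", data, addr)[0]


def decode(data, addr):
    """Return (text, target) for the word at addr; target may be None."""
    word = word_at(data, addr)
    if (word & 0xFFFFF000) == 0xE59FF000:
        # ldr pc, [pc, #imm12]; pc reads as the instruction address + 8.
        imm = word & 0xFFF
        literal = addr + 8 + imm
        target = word_at(data, literal) if literal + 4 <= len(data) else None
        return f"ldr pc, [pc, #{imm:#x}]", target
    if word == 0xE1A00000:
        return "mov r0, r0", None
    if (word & 0xFF000000) == 0xEA000000:
        # b <label>: signed 24-bit word offset relative to addr + 8.
        imm24 = word & 0xFFFFFF
        if imm24 & 0x800000:
            imm24 -= 1 << 24
        target = addr + 8 + (imm24 << 2)
        return f"b {target:#x}", target
    return f".word {word:#010x}", None


def vector_targets(data):
    """Map each vector slot name to the handler address it jumps to."""
    return {name: decode(data, 4 * i)[1] for i, name in enumerate(VECTORS)}


if __name__ == "__main__":
    for name, target in vector_targets(BAD_SECTOR0).items():
        print(name, hex(target) if target is not None else "-")
```

On the dump above this resolves the reset vector to 0x9f8, and the undef and svc vectors to 0x3c and 0x40, both of which hold a branch-to-self (`fe ff ff ea`). That layout is consistent with an ARM vector table followed by its literal pool.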

What should be there is the partition table:

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  bf 64 9b 0b 00 00 00 20  |.........d..... |
000001c0  21 00 83 53 09 0a 00 08  00 00 00 80 02 00 00 53  |!..S...........S|
000001d0  0a 0a 83 a8 0a 1e 00 88  02 00 00 00 00 01 00 a8  |................|
000001e0  0b 1e 83 1d 3f ce 00 88  02 01 b0 3a e5 0d 00 00  |....?......:....|
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|

The SSD has had no ARM executables copied on to it since it was purchased. It contains a DOS partition and swap space, so it's possible that the swap space could have had ARM code swapped to it. Grepping the swap partition for the initial sequence of 8 bytes (18 f0 9f e5 18 f0 9f e5) gives nothing.

Another possibility is that we didn't actually read any data from the SSD at all but ended up copying it from somewhere else. As I say above, it looks like what one would expect to find in the ARM vector page. It doesn't tie up with the iMX6 ROM nor the uboot image in the SD card. It's not the kernel's vectors either.

It could be the SSD firmware (I haven't checked, but I wouldn't be surprised if the SSD has an ARM CPU on it) but that would point towards a very weird failure of the SSD - though not if it's receiving corrupted commands and there was no verification of those commands. Without a firmware image of the SSD (which isn't going to be possible to get hold of) it's impossible to know.

What concerns me is that this incorrect data was successfully read, allegedly from the device. What happens if a write were to occur - what would we be writing to? Also, the identify information is clearly screwed: some of it does seem to be correct, such as the capacity, but the device identifiers are broken.

How does SATA ensure command and data integrity over the link? I'd assume that there's a CRC present on the data, like UDMA on PATA.
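
Going back to the good partition table dump above: as a sanity check on what sector 0 ought to contain, here is a minimal Python sketch that parses the four primary entries from the tail of the sector (offsets 0x1b0 to 0x1ff, copied from the hexdump). The names `MBR_TAIL` and `parse_partitions` are made up for illustration, and it assumes a plain MBR with 512-byte sectors.

```python
import struct

# Bytes 0x1b0..0x1ff of the good sector 0, copied from the hexdump above.
MBR_TAIL = bytes.fromhex(
    "00 00 00 00 00 00 00 00 bf 64 9b 0b 00 00 00 20 "
    "21 00 83 53 09 0a 00 08 00 00 00 80 02 00 00 53 "
    "0a 0a 83 a8 0a 1e 00 88 02 00 00 00 00 01 00 a8 "
    "0b 1e 83 1d 3f ce 00 88 02 01 b0 3a e5 0d 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa"
)
TAIL_OFFSET = 0x1B0


def parse_partitions(tail, tail_offset=TAIL_OFFSET):
    """Return [(index, type, first_lba, sector_count)] for used entries."""

    def at(offset, length):
        start = offset - tail_offset
        return tail[start:start + length]

    if at(0x1FE, 2) != b"\x55\xaa":
        raise ValueError("missing 55 aa boot signature")

    partitions = []
    for i in range(4):
        entry = at(0x1BE + 16 * i, 16)
        # status, 3 CHS bytes, type, 3 CHS bytes, first LBA, sector count
        _status, ptype, first_lba, count = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:
            partitions.append((i + 1, ptype, first_lba, count))
    return partitions


if __name__ == "__main__":
    for index, ptype, first_lba, count in parse_partitions(MBR_TAIL):
        print(f"sda{index}: type {ptype:#04x}, start {first_lba}, "
              f"sectors {count}, end {first_lba + count}")
```

With the bytes above, the third entry ends at sector 250069680, which matches the capacity the kernel reports for the drive, so the good dump is at least self-consistent.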

How are CRC errors supposed to be reported? Is it possible that ahci_imx and other layers are not properly checking for CRC errors? Any ideas what to look at?

Anyone got any suggestions on where to get a good quality, but not stupidly expensive, eSATA cable from?

I'm waiting for it to happen again, and I'll dump out more of the drive's "contents" when the cable is bad. If it is the drive's firmware, it should contain the manufacturer name/model somewhere in the image.
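
For reference when reading the SErr and SStatus values in the logs above, here is a rough Python sketch of how those registers break down. The bit names follow the ones libata prints in its "SError: { ... }" line as far as I understand them; the table and function names here are made up for illustration, not taken from any kernel source.

```python
# SError bit positions, named the way the kernel log prints them.
# This table is written from memory of the SATA SError layout, so
# treat it as an assumption to be checked against the spec.
SERR_BITS = {
    0: "RecovData", 1: "RecovComm",
    8: "UnrecovData", 9: "Persist", 10: "Proto", 11: "HostInt",
    16: "PHYRdyChg", 17: "PHYInt", 18: "CommWake", 19: "10B8B",
    20: "Dispar", 21: "BadCRC", 22: "Handshk", 23: "LinkSeq",
    24: "TrStaTrns", 25: "UnrecFIS", 26: "DevExch",
}


def decode_serror(serr):
    """Return the names of all bits set in an SError value."""
    return [name for bit, name in sorted(SERR_BITS.items())
            if serr & (1 << bit)]


def decode_sstatus(sstatus):
    """Split SStatus into its DET, SPD and IPM fields."""
    return {
        "det": sstatus & 0xF,          # 3 = device present, PHY up
        "spd": (sstatus >> 4) & 0xF,   # 1 = 1.5 Gbps, 2 = 3.0 Gbps
        "ipm": (sstatus >> 8) & 0xF,   # 1 = interface active
    }


if __name__ == "__main__":
    print(decode_serror(0x4050002))
    print(decode_sstatus(0x123))
```

Decoding 0x4050002 gives RecovComm, PHYRdyChg, CommWake and DevExch, which matches the kernel's own SError line for the replug event. A link-level CRC error would show up as bit 21 (BadCRC), and that bit is not set in the one SErr value logged here.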