Hello,

Denys Dmytriyenko wrote:
>> Hmmm... This is first. Which driver is it? It means that controller is
>> reporting that NCQ command tags which are not issued (or already
>> completed) are in-flight. Due to the way hdd reports NCQ command
>> completion, it's not possible for the drive to cause this. This gotta
>> be a bug on the host side (be it controller chip or more likely the
>> driver). The command tag in question is 5. Only 0, 3 and 4 were in flight.
>
> It is sata_sil24 on 2.6.23.9. If there were related fixes in the recent
> versions, I can retest it.

No, not that I know of.

>> This one is different. The drive reported device error but the driver
>> couldn't get more information about the error (log page 10h contains
>> it). What does smartctl -a on the drive say?
>
> # smartctl -a /dev/sdc
> smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
>   9 Power_On_Hours  0x0032  242  242  000  Old_age  Always  -  3941

Okay, power on hours is 3941.

> Error 42 occurred at disk power-on lifetime: 3444 hours (143 days + 12 hours)
>   When the command that caused the error occurred, the device was in an unknown state.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   84 41 28 ff 46 5a 40
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ---------------- --------------------
>   60 08 28 ff 46 5a 40 00  2d+07:38:11.073  READ FPDMA QUEUED
>   60 08 28 ff 46 5a 40 00  2d+07:38:11.073  READ FPDMA QUEUED
>   60 08 28 ff 46 5a 40 00  2d+07:38:11.073  READ FPDMA QUEUED
>   60 10 20 2f 47 5a 40 00  2d+07:38:11.073  READ FPDMA QUEUED
>   60 08 18 1f 47 5a 40 00  2d+07:38:11.073  READ FPDMA QUEUED

Error 42 occurred about 21 days ago.
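The 21-day figure comes from subtracting the two power-on counters in the smartctl output. A minimal sketch of that arithmetic, assuming the machine has been powered on more or less continuously since the error (the counter only advances while the drive is powered); the function names are made up for illustration:

```python
# Values copied from the quoted smartctl output.
POWER_ON_HOURS_NOW = 3941   # attribute 9, Power_On_Hours raw value
ERROR_42_HOURS = 3444       # "Error 42 occurred at disk power-on lifetime"


def hours_since(error_hours, now_hours=POWER_ON_HOURS_NOW):
    """Power-on hours elapsed between a logged error and the current counter."""
    return now_hours - error_hours


def as_days(hours):
    """Convert power-on hours to days, assuming the drive stayed powered."""
    return hours / 24.0


elapsed = hours_since(ERROR_42_HOURS)
print(f"{elapsed} power-on hours, about {as_days(elapsed):.1f} days")
# prints: 497 power-on hours, about 20.7 days
```

If the box was powered off for part of that time, the wall-clock gap is longer than this estimate.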
Unless your clock is off, I don't think this is what you've seen, but the error is UNC (uncorrectable media error), so it does mean that your drive has some bad sectors, which can explain the device error you saw.

> Error 41 occurred at disk power-on lifetime: 3405 hours (141 days + 21 hours)
>   When the command that caused the error occurred, the device was in an unknown state.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   00 41 01 10 00 00 a0  Error:
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ---------------- --------------------
>   2f 00 01 10 00 00 a0 00  12:51:00.112     READ LOG EXT
>   60 20 20 7f 32 4c 40 00  12:51:00.081     READ FPDMA QUEUED
>   60 08 18 6f 32 4c 40 00  12:51:00.081     READ FPDMA QUEUED
>   60 30 10 9f 32 4c 40 00  12:51:00.081     READ FPDMA QUEUED
>   60 08 08 5f 32 4c 40 00  12:51:00.081     READ FPDMA QUEUED

Hmm, this one is less clear. Maybe the device wasn't expecting READ LOG EXT as it was still in NCQ command phase and got surprised?

Currently you're the first and only one to report the illegal qc_active transition problem. I'd like to know what precedes the error, which isn't exactly easy in retrospect. For now, please keep an eye on those errors and report if you can see any pattern. And just in case, can you get 2.6.24 on the machine and see whether anything changes?

--
tejun
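For reference, here is a rough sketch of the kind of check that produces the illegal qc_active transition message. It is loosely modeled on how libata compares the set of tags it issued with the set the controller reports as still active; it is not the driver code. The function names are hypothetical, and the reported mask containing tag 5 is invented for illustration, since the actual value is not quoted in this thread.

```python
def tags_to_mask(tags):
    """Build a bitmask with one bit set per NCQ tag."""
    mask = 0
    for tag in tags:
        mask |= 1 << tag
    return mask


def check_transition(old_active, new_active):
    """Return the mask of tags that completed, or raise if the controller
    reports a tag as active that was not active before."""
    done_mask = old_active ^ new_active
    # Any changed bit that is set in new_active is a tag that "appeared"
    # without being issued, which must not happen.
    if done_mask & new_active:
        raise ValueError(
            f"illegal qc_active transition ({old_active:08x}->{new_active:08x})"
        )
    return done_mask


issued = tags_to_mask([0, 3, 4])          # tags actually in flight

# Legal: tag 0 completes, tags 3 and 4 remain active.
completed = check_transition(issued, tags_to_mask([3, 4]))
print(f"completed mask: {completed:#x}")

# Illegal (hypothetical value): the controller claims tag 5 is in flight.
try:
    check_transition(issued, tags_to_mask([0, 3, 4, 5]))
except ValueError as err:
    print(err)
```

Bits may only go from set to clear between two snapshots, so a drive completing commands cannot trigger this; only a host-side report of a tag that was never issued can.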