Nicolas STRANSKY wrote:
On 05/16/2006 10:11 AM, Tejun Heo wrote:
Hi,
If you've got some time though, I'd like to see what's really going on.
Can you change #undef ATA_DEBUG to #define ATA_DEBUG in
include/linux/libata.h and post the kernel messages after issuing the above
command? Be warned that it will produce a LOT of messages while booting
if you're using SATA disks for your system, and it can considerably slow
down booting.
Here it is.
I first did a "smartctl -d ata -a -o on /dev/sda" and then a "smartctl
-d ata -a -S on /dev/sda" which are the two commands triggering errors
on this drive with your patch.
BTW the system is running very well with your patch apart from this
smartctl problem.
[CC'ing Albert Lee].
Hello, Albert.
Nicolas reported that when smartd starts, the kernel complains about an HSM
violation and full EH kicks in (reset and all that), which didn't
happen before the recent libata changes. Upon further examination, the
offending part seems to be the HSM_ST handling code of ata_hsm_move().
	/* ATA PIO protocol */
	if (unlikely((status & ATA_DRQ) == 0)) {
		/* handle BSY=0, DRQ=0 as error */
		qc->err_mask |= AC_ERR_HSM;
		ap->hsm_task_state = HSM_ST_ERR;
		goto fsm_start;
	}
The above is the first test done on entry to HSM_ST for non-ATAPI
devices. On startup, smartd issues some obsolete commands (feat: 0xd1
and 0xdb) which use the PIO data-in protocol. Some drives don't implement
these obsolete commands and abort them (stat: 0x51 err: 0x4), which is the
correct behavior for a drive that doesn't implement a specific command.
However, the above code triggers and the error is handled as an HSM
violation rather than a device abort.
It seems that HSM_ST needs to handle the !DRQ && ERR case before the first
iteration (or maybe it should be pushed into HSM_ST_FIRST?). Does my
analysis make sense?
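
As a rough illustration of the ordering suggested above, here is a minimal
standalone sketch. It is not the actual libata code: classify_pio_status()
is a hypothetical helper, it assumes BSY has already cleared, and the
AC_ERR_* values are local stand-ins defined only so the example compiles
on its own. The idea is simply to test ERR before concluding that a
missing DRQ is an HSM violation.

```c
#include <stdint.h>

/* ATA status register bits as defined by the ATA specification. */
#define ATA_DRQ	0x08	/* data request: device ready to transfer data */
#define ATA_ERR	0x01	/* error: command aborted or failed */

/* Local stand-ins for libata's error mask bits (illustrative values). */
enum {
	AC_ERR_NONE	= 0,
	AC_ERR_DEV	= 1 << 0,	/* device reported an error */
	AC_ERR_HSM	= 1 << 1,	/* host state machine violation */
};

/*
 * Hypothetical helper: classify the status seen on entry to the PIO
 * data phase, assuming BSY=0 has already been established.
 */
static unsigned int classify_pio_status(uint8_t status)
{
	if (status & ATA_DRQ)
		return AC_ERR_NONE;	/* data phase can proceed */

	if (status & ATA_ERR)
		return AC_ERR_DEV;	/* !DRQ && ERR: device aborted the command */

	return AC_ERR_HSM;		/* !DRQ && !ERR: genuine HSM violation */
}
```

With this ordering, the status reported here (0x51: DRQ clear, ERR set)
would be classified as a device error, while a status with both DRQ and
ERR clear would still be treated as an HSM violation.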
Thanks.
--
tejun