Re: Fwd: Re: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen - "dead" harddisc until reboot

Hello Robert,

So far I have not tried other UDMA modes. The controller is UDMA133-capable, the flash card is UDMA66-capable, and the hard disk is limited to UDMA44 by the 44-pin cable. With legacy ATA I also use the UDMA66 / UDMA44 modes.

But perhaps these logs are another step in the right direction. My last libata crash looked like this:

ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
ata2.01: BMDMA stat 0x65
ata2.01: failed command: READ DMA
ata2.01: cmd c8/00:08:9f:03:41/00:00:00:00:00/f6 tag 0 dma 4096 in
        res 00/00:08:9f:03:41/00:00:00:00:00/f6 Emask 0x2 (HSM violation)
ata2: soft resetting link
ata2.00: FORCE: xfer_mask set to udma4
ata2.01: FORCE: xfer_mask set to udma3
ata2.00: configured for UDMA/66
ata2.01: configured for UDMA/44
ata2.00: FORCE: xfer_mask set to udma4
ata2.01: FORCE: xfer_mask set to udma3
ata2.00: configured for UDMA/66
ata2.01: configured for UDMA/44
ata2: EH complete
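
For reading the "BMDMA stat" value in the log above, here is a minimal Python sketch. It assumes the standard SFF-8038i bus-master IDE status register layout; the helper name decode_bmdma_status is made up for illustration and is not part of any kernel tool.

```python
# Sketch: decode a bus-master IDE (BMDMA) status byte.
# Assumption: standard SFF-8038i status register bit layout.
BMDMA_STATUS_BITS = {
    0x01: "active",              # bus-master DMA engine still running
    0x02: "error",               # DMA error reported by the controller
    0x04: "interrupt",           # device has raised an interrupt
    0x20: "drive0_dma_capable",  # set by firmware/driver for device 0
    0x40: "drive1_dma_capable",  # set by firmware/driver for device 1
    0x80: "simplex",             # only one channel may do DMA at a time
}


def decode_bmdma_status(value):
    """Return the names of all bits set in a BMDMA status byte."""
    return [name for mask, name in BMDMA_STATUS_BITS.items() if value & mask]


if __name__ == "__main__":
    # 0x65 is from the libata log, 0x61 from the legacy IDE logs.
    for status in (0x65, 0x61):
        print(f"0x{status:02x}: {', '.join(decode_bmdma_status(status))}")
```

Under that assumption, 0x65 decodes to "active, interrupt, both drives DMA capable", and 0x61 from the legacy IDE logs is the same without the interrupt bit. The error bit is clear in both values.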

So far I have found legacy ATA logs that look much the same, or at least just as bad:

hdd: ide_dma_sff_timer_expiry: DMA status (0x61)
hdd: dma_intr: status=0x7f { DriveReady DeviceFault SeekComplete DataRequest CorrectedError Index Error }
hdd: dma_intr: error=0x7f { DriveStatusError UncorrectableError SectorIdNotFound TrackZeroNotFound AddrMarkNotFound }, LBAsect=8830587504648, sector=209806663
hdd: possibly failed opcode: 0x25
hdc: DMA disabled
hdd: DMA disabled
ide1: reset: success
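
The status byte in the log above can be decoded with a small Python sketch. It assumes the classic ATA status register layout and reuses the flag names the legacy IDE driver prints; the function name decode_ata_status is hypothetical.

```python
# Sketch: decode an ATA status register byte into the flag names
# used in the legacy IDE driver's log output.
# Assumption: classic ATA status register bit layout.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(value):
    """Return the names of all flags set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if value & mask]


if __name__ == "__main__":
    for status in (0x7F, 0x80):
        print(f"status=0x{status:02x} {{ {' '.join(decode_ata_status(status))} }}")
```

A status of 0x7f means every flag except Busy is set at once, including DataRequest, DeviceFault and Error together. That combination looks less like a genuine report from the drive and more like a garbled register read, though this is only one possible reading.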

The logged LBAsect and the logged sector do not exist on this drive, but this is neither a hard disk failure nor a filesystem failure. It must be an ugly bug in this chipset, or maybe just a communication problem between controller and hard disk. I have already changed the cabling, the hard disk (three times by now! On my current drive I did a complete write with dd: the kernel logged no bad sectors and SMART shows no pending or reallocated sectors, so the hard disk has no problem), the CompactFlash card (also three times, and already from another manufacturer; the card is currently two months old and I do not believe it is damaged, although I only ran read-only tests), and the memory (although I had tested it several times with memtest). If there is a hardware failure, it can only be the IDE controller, which I cannot check.
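
A quick way to see why the logged LBAsect cannot be a real address is to look at its bytes. The Python sketch below assumes a 48-bit LBA (six bytes) and 512-byte sectors; the helper name lba_bytes is made up for illustration.

```python
# Sketch: split the logged LBAsect value into its six LBA48 bytes.
# Assumptions: 48-bit LBA, 512-byte sectors.
LOGGED_LBA = 8830587504648  # value taken from the dma_intr log line


def lba_bytes(lba, width=6):
    """Return the big-endian bytes of an LBA as a list of integers."""
    return list(lba.to_bytes(width, "big"))


if __name__ == "__main__":
    print([f"0x{b:02x}" for b in lba_bytes(LOGGED_LBA)])
    # Size of a disk that would actually contain this sector.
    print(f"{LOGGED_LBA * 512 / 1e15:.1f} PB")
```

All six bytes come out as 0x08, so the "address" is the same byte repeated six times and would lie roughly 4.5 petabytes into the disk. That fits the suspicion of a communication problem between controller and drive rather than a bad sector.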

My idea: libata is not able to handle this issue the way the legacy IDE driver did. As logged, the channel was reset and both drives are in PIO mode from then on, but I can manually set them to DMA again and it works "as well" as before. With libata I am sure this would have been another reason for a reset. Libata always seems to force UDMA mode after the reset. Is there a way to work around this?

Generally, the DMA behaviour of legacy IDE is much better in my opinion: the DMA mode can be set from userspace with hdparm. libata does not offer such a possibility, does it? So I have no way to control or change the behaviour after boot; I have to hope that the fallback mechanism is good enough...

I also saw "harmless" IDE communication problems:

hdd: ide_dma_sff_timer_expiry: DMA status (0x61)
hdd: DMA timeout error
hdd: dma timeout error: status=0x80 { Busy }
hdd: possibly failed opcode: 0x25
hdc: DMA disabled
hdd: DMA disabled
ide1: reset: success

After this reset, too, I simply re-enabled DMA; the machine is still running, no reset necessary. What would libata do?

Any ideas? I am really desperate by now. :'-(

thanks!
Alois

Robert Hancock wrote:
On 06/10/2010 01:52 PM, MadLoisae@xxxxxxx wrote:
Hi there,

Currently I am using kernel 2.6.34; so far every (stable) release
since 2.6.30 has been affected by this issue for me.
Today I reactivated the legacy, regrettably deprecated parallel ATA
support and disabled libata. It's a shame: libata is much faster
(about 20% faster I/O, measurable) and more forward-looking, but it is
not a real alternative if it crashes continually but not reproducibly on
VIA chipsets (google for this, the web is full of this issue!). I
know VIA chipsets are not very good, but shouldn't we try to make things
better (or at least as good as possible) with newer drivers instead of worse? I hope legacy ATA support won't be removed from the kernel sources soon ...

I like trying things out, but if the response is so thin it is no
fun just looking at the same issue with the same messages again and
again, unable to do anything besides looking at it and resetting the box
afterwards ...

Have you tried limiting the speed to UDMA2? If that helps then it could be that the motherboard circuitry, etc. isn't suitable for faster speeds.
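
One way to try the UDMA2 limit is the libata.force kernel parameter, which the "FORCE: xfer_mask" lines in the log suggest is already in use on this machine. The Python sketch below only builds the parameter string. The helper name libata_force is hypothetical, the device IDs are taken from the log (ata2.00 and ata2.01), and the accepted mode names are deliberately restricted to a simple subset for this example.

```python
import re

# Sketch: build a "libata.force=" kernel command-line parameter.
# Assumptions: IDs are PORT or PORT.DEVICE as printed by libata
# (ata2.01 -> "2.01"); only pioN, mwdmaN and udmaN modes are handled.
_ID_RE = re.compile(r"^\d+(\.\d+)?$")
_MODE_RE = re.compile(r"^(pio[0-6]|mwdma[0-4]|udma[0-7])$")


def libata_force(settings):
    """Turn {"2.00": "udma2", ...} into a libata.force= parameter string."""
    parts = []
    for dev_id, mode in settings.items():
        if not _ID_RE.match(dev_id):
            raise ValueError(f"unexpected device ID: {dev_id!r}")
        if not _MODE_RE.match(mode):
            raise ValueError(f"unexpected transfer mode: {mode!r}")
        parts.append(f"{dev_id}:{mode}")
    return "libata.force=" + ",".join(parts)


if __name__ == "__main__":
    # Limit both devices on port 2 to UDMA2 (UDMA/33).
    print(libata_force({"2.00": "udma2", "2.01": "udma2"}))
```

The resulting string would be appended to the kernel command line in the boot loader configuration; it takes effect at boot and cannot be changed afterwards from userspace.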

Random timeouts are unfortunately quite hard to debug since there are so many problems that can cause them but the symptoms are the same: could be that there was an error on the bus that caused something to stall, an interrupt got lost somehow, etc. Or maybe the timing of device access is somehow different and thus more likely to trigger whatever the cause is. There also seem to be a fair number of bugs in these IDE chipsets that the driver has to work around, could be there is one missing in the libata version.

