Re: Fwd: Re: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen - "dead" harddisc until reboot

Hello Robert,

Robert Hancock wrote:
On Fri, Jun 11, 2010 at 12:04 PM, MadLoisae@xxxxxxx <MadLoisae@xxxxxxx> wrote:
Hello Robert,

At this time I have not tried other UDMA modes. The controller is
UDMA133-capable, the flash card is UDMA66-capable and the hard disk is
(limited by the 44-pin cable) UDMA44-capable. With legacy ATA I also use the
UDMA66 / UDMA44 modes.

If it's a 40-pin cable, the max is UDMA33, not UDMA44. What happens if
you force UDMA33 on both devices?

Yes, it's a 40-pin cable to the 2.5" hard disk. I have now limited its speed to UDMA33. The CF card is not attached to a limiting cable, so I assume I can use UDMA66 there?
With legacy IDE I never had problems using UDMA44 on this drive.
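
For reference, the "FORCE: xfer_mask set to ..." lines in the log below come from the libata.force= boot parameter. A minimal Python sketch of how the per-device limits discussed here map onto that parameter follows. The PORT.DEVICE:udmaN syntax is written from memory of the kernel-parameters documentation and should be checked against the running kernel; the helper names udma_mode_for and libata_force are made up for this illustration.

```python
# Sketch: build a libata.force= value that caps each device's transfer mode.
# Port/device IDs ("2.00", "2.01") are the ones that appear in this thread's logs.

# UDMA mode number -> nominal speed as printed by libata ("UDMA/66" etc.)
UDMA_SPEED = {0: 16, 1: 25, 2: 33, 3: 44, 4: 66, 5: 100, 6: 133}


def udma_mode_for(speed):
    """Return the udmaN mode number for a nominal UDMA/NN speed."""
    for mode, nominal in UDMA_SPEED.items():
        if nominal == speed:
            return mode
    raise ValueError(f"unknown UDMA speed: {speed}")


def libata_force(limits):
    """limits maps 'PORT.DEVICE' (e.g. '2.01') to a nominal UDMA speed."""
    parts = [f"{dev}:udma{udma_mode_for(speed)}"
             for dev, speed in sorted(limits.items())]
    return "libata.force=" + ",".join(parts)


# CF card (ata2.00) at UDMA66, hard disk (ata2.01) capped at UDMA33
print(libata_force({"2.00": 66, "2.01": 33}))
```

The udma4 = UDMA/66 and udma3 = UDMA/44 pairs in the table match what the log prints, so the UDMA33 cap corresponds to udma2.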
But perhaps these logs are another step in the right direction: my last
libata crash looked like this:

ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
ata2.01: BMDMA stat 0x65
ata2.01: failed command: READ DMA
ata2.01: cmd c8/00:08:9f:03:41/00:00:00:00:00/f6 tag 0 dma 4096 in
       res 00/00:08:9f:03:41/00:00:00:00:00/f6 Emask 0x2 (HSM violation)

This one complained because the bits in the status register read from
the drive don't seem to make any sense (specifically none are set,
when DRDY should be).
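
To make the above easier to follow, here is a small Python sketch that splits the cmd/res lines into their register bytes and names the status bits. The field order (command or status / feature or error : count : LBA low : mid : high / high-order bytes / device) is an assumption based on how libata normally prints these lines; parse_taskfile, lba28 and decode_status are hypothetical helper names, not kernel functions.

```python
# ATA status register bits, bit 7 down to bit 0. The legacy IDE driver prints
# the same bits as Busy, DriveReady, DeviceFault, SeekComplete, DataRequest,
# CorrectedError, Index and Error.
STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "DSC"),
               (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR")]


def decode_status(value):
    """Return the names of all status bits set in value."""
    return [name for mask, name in STATUS_BITS if value & mask]


def parse_taskfile(text):
    """Split 'c8/00:08:9f:03:41/00:00:00:00:00/f6' into named byte fields."""
    first, low, _hob, device = text.split("/")
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    return {"first": int(first, 16), "feature": feature, "nsect": nsect,
            "lbal": lbal, "lbam": lbam, "lbah": lbah,
            "device": int(device, 16)}


def lba28(tf):
    """28-bit LBA: low nibble of the device register plus three LBA bytes."""
    return (((tf["device"] & 0x0F) << 24) | (tf["lbah"] << 16)
            | (tf["lbam"] << 8) | tf["lbal"])


cmd = parse_taskfile("c8/00:08:9f:03:41/00:00:00:00:00/f6")
res = parse_taskfile("00/00:08:9f:03:41/00:00:00:00:00/f6")

print(hex(cmd["first"]))           # command byte, 0xc8 = READ DMA
print(lba28(cmd))                  # starting sector of the failed read
print(cmd["nsect"] * 512)          # transfer size in bytes ("dma 4096")
print((cmd["device"] >> 4) & 1)    # device select bit: 1 = second device (.01)
print(decode_status(res["first"]))  # status byte of the result: no bits set
```

A healthy idle drive would normally report at least DRDY, so an all-zero status byte is what triggers the HSM violation here.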

ata2: soft resetting link
ata2.00: FORCE: xfer_mask set to udma4
ata2.01: FORCE: xfer_mask set to udma3
ata2.00: configured for UDMA/66
ata2.01: configured for UDMA/44
ata2.00: FORCE: xfer_mask set to udma4
ata2.01: FORCE: xfer_mask set to udma3
ata2.00: configured for UDMA/66
ata2.01: configured for UDMA/44
ata2: EH complete

Does it resume operation after this?
No, the machine was dead. After these messages my partitions normally get remounted read-only, ext3/ext4 journaling is aborted and a lot of "bad sectors" are logged in dmesg. Then the only options are to power-cycle the machine or to use sysrq-trigger to "reboot" it, but a console is not always open, so normally I have to power off and on.
So far I have found legacy ATA logs that look the same, or at least like
crap:

hdd: ide_dma_sff_timer_expiry: DMA status (0x61)
hdd: dma_intr: status=0x7f { DriveReady DeviceFault SeekComplete DataRequest
CorrectedError Index Error }
hdd: dma_intr: error=0x7f { DriveStatusError UncorrectableError
SectorIdNotFound TrackZeroNotFound AddrMarkNotFound },
LBAsect=8830587504648, sector=209806663
hdd: possibly failed opcode: 0x25
hdc: DMA disabled
hdd: DMA disabled
ide1: reset: success

The logged LBAsect and the logged sector do not exist on this drive, but this
is neither a hard disk failure nor a filesystem failure. It must be an ugly
bug in this chipset, or maybe just a communication problem between controller
and hard disk. I have already replaced the cabling, the hard disk (three times
by now! On my current drive I did a complete write with dd: the kernel logged
no bad sectors and SMART shows no pending or reallocated sectors, so the hard
disk has no problem), the compact flash (also three times, already from
another manufacturer; the flash is currently two months old and I do not
believe it is damaged, although I only did read-only tests) and the memory
(although I had tested it several times with memtest). If there is a hardware
failure, it can only be the IDE controller, which I cannot check.
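
One observation on the logged LBAsect value, sketched in Python: written as a 48-bit address, all six bytes are identical. That would fit the driver reading the same stray byte from every address register rather than a real sector number, but it is only arithmetic on the logged value, not a diagnosis.

```python
lbasect = 8830587504648  # LBAsect value from the dma_intr log line above

raw = lbasect.to_bytes(6, "big")   # 48-bit LBA as six register bytes
print(hex(lbasect))                # the value in hexadecimal
print(raw.hex(" "))                # the six bytes, one per address register
print(len(set(raw)) == 1)          # True if every byte is the same value
```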

My idea: libata is not able to handle this issue the way the legacy IDE
driver did. As logged, the channel got reset and both drives are in PIO mode
from then on, but I can manually set them to DMA again and it works "as well"
as before. With libata I am sure this would have been another reason for a
reset. libata seems to always force UDMA mode after the reset. Is there a way
to work around this?

Generally, the DMA behaviour of legacy IDE is much better in my opinion: it is
possible to set the DMA mode from userspace with hdparm. libata does not
offer such a possibility, does it? So I have no way to control or change the
behaviour after boot, I have to hope that the fallback mechanism is good
enough...

libata doesn't currently offer a mechanism to control the DMA setting
from userspace, no.

It does seem like you're having some rather major communication
problems on the bus - the error below seems to indicate that the DMA
transfer stalled:

Doesn't libata have a fallback mechanism to talk to the drive again?
Nevertheless, I am back on libata with UDMA33 and am testing whether this helps. Thanks.
I also saw "harmless" IDE communication problems:

hdd: ide_dma_sff_timer_expiry: DMA status (0x61)
hdd: DMA timeout error
hdd: dma timeout error: status=0x80 { Busy }
hdd: possibly failed opcode: 0x25
hdc: DMA disabled
hdd: DMA disabled
ide1: reset: success
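
The "DMA status (0x61)" value above, and the "BMDMA stat 0x65" from the libata log, can be broken down with a short Python sketch. The bit layout is the standard bus-master IDE status register as I understand it; the names are chosen for this illustration.

```python
# Bus-master IDE (BMDMA) status register bits, low bit first.
BMDMA_BITS = [(0x01, "ACTIVE"), (0x02, "ERROR"), (0x04, "INTR"),
              (0x20, "DRV0_DMA_CAP"), (0x40, "DRV1_DMA_CAP"),
              (0x80, "SIMPLEX")]


def decode_bmdma(value):
    """Return the names of all BMDMA status bits set in value."""
    return [name for mask, name in BMDMA_BITS if value & mask]


print(decode_bmdma(0x61))  # legacy IDE timeout
print(decode_bmdma(0x65))  # libata HSM violation
```

In the 0x61 case the engine is still marked active with neither the interrupt nor the error bit set, which is consistent with a transfer that stalled until the timer expired. In the 0x65 case an interrupt was raised while the engine was still active.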

After this reset, too, I just re-enabled DMA. The machine is still running,
no reset necessary. What would libata do?

Any ideas? I am really desperate by now. :'-(

thanks!
Alois

