Hello Robert,
so far I have not tried other UDMA modes. The controller is UDMA133
capable, the flash card is UDMA66 capable and the hard disk is (limited
by the 44-pin cable) UDMA44 capable. With legacy ATA I also use the
UDMA66 / UDMA44 modes.
But perhaps these logs are another step in the right direction. My last
libata crash looked like this:
ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
ata2.01: BMDMA stat 0x65
ata2.01: failed command: READ DMA
ata2.01: cmd c8/00:08:9f:03:41/00:00:00:00:00/f6 tag 0 dma 4096 in
res 00/00:08:9f:03:41/00:00:00:00:00/f6 Emask 0x2 (HSM violation)
ata2: soft resetting link
ata2.00: FORCE: xfer_mask set to udma4
ata2.01: FORCE: xfer_mask set to udma3
ata2.00: configured for UDMA/66
ata2.01: configured for UDMA/44
ata2.00: FORCE: xfer_mask set to udma4
ata2.01: FORCE: xfer_mask set to udma3
ata2.00: configured for UDMA/66
ata2.01: configured for UDMA/44
ata2: EH complete
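For anyone decoding the libata log above, here is a minimal Python sketch (not taken from the libata sources) that splits the "cmd" taskfile line into its fields and names the bits of the BMDMA status byte. The field order (command/feature:nsect:lbal:lbam:lbah/HOB registers/device) and the bit names are assumptions based on the libata error message format and the SFF-8038i bus-master status register layout; the function names are hypothetical.

```python
# Bus-master IDE status register bits (SFF-8038i layout, assumed here).
BMDMA_STATUS_BITS = {
    0x01: "active",
    0x02: "error",
    0x04: "interrupt",
    0x20: "drive0 DMA capable",
    0x40: "drive1 DMA capable",
    0x80: "simplex only",
}


def decode_bmdma_status(value):
    """Return the names of all bits set in a BMDMA status byte."""
    return [name for bit, name in BMDMA_STATUS_BITS.items() if value & bit]


def decode_libata_cmd(tf):
    """Decode a libata 'cmd' taskfile string for a 28-bit LBA command."""
    command, low, _hob, device = tf.split("/")
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    dev = int(device, 16)
    # For 28-bit commands, LBA bits 27:24 live in the low nibble of the
    # device register; bit 4 of the device register selects the slave.
    lba = ((dev & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    return {
        "command": int(command, 16),
        "sectors": nsect or 256,  # a count of 0 means 256 sectors
        "lba28": lba,
        "slave": bool(dev & 0x10),
    }


tf = decode_libata_cmd("c8/00:08:9f:03:41/00:00:00:00:00/f6")
print(hex(tf["command"]), tf["sectors"], hex(tf["lba28"]), tf["slave"])
print(decode_bmdma_status(0x65))
```

For this log the sketch gives READ DMA (0xc8) for 8 sectors (8 x 512 = 4096 bytes, matching "dma 4096 in") addressed to the slave device, and status 0x65 decodes to "active" plus "interrupt" with both drives marked DMA capable.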
So far I have found legacy ATA logs that look similar, or at least
equally bad:
hdd: ide_dma_sff_timer_expiry: DMA status (0x61)
hdd: dma_intr: status=0x7f { DriveReady DeviceFault SeekComplete
DataRequest CorrectedError Index Error }
hdd: dma_intr: error=0x7f { DriveStatusError UncorrectableError
SectorIdNotFound TrackZeroNotFound AddrMarkNotFound },
LBAsect=8830587504648, sector=209806663
hdd: possibly failed opcode: 0x25
hdc: DMA disabled
hdd: DMA disabled
ide1: reset: success
Neither the logged LBAsect nor the logged sector exists on this drive,
but this is neither a hard disk failure nor a filesystem failure. It
must be an ugly bug in this chipset, or maybe just a communication
problem between controller and hard disk. I have already replaced the
cabling, the hard disk (three times by now; on my current drive I did a
complete write with dd, the kernel logged no bad sectors and SMART shows
no pending or reallocated sectors, so the hard disk has no problem), the
CompactFlash card (also three times, already from another manufacturer;
the card is currently two months old and I do not believe it is damaged,
although I only ran read-only tests) and the memory (although I had
tested it several times with memtest). If there is a hardware failure,
it can only be the IDE controller, which I cannot check.
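A quick check of the numbers in the legacy log is consistent with the communication-problem theory. The following Python sketch splits the logged LBAsect into its six LBA48 register bytes and names the bits of the status byte. The bit names follow the usual ATA status register layout; the function names are hypothetical, and reading the result as a garbled register read is an interpretation, not something the log proves.

```python
# ATA status register bits, most significant first.
ATA_STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC",
    0x08: "DRQ",
    0x04: "CORR",
    0x02: "IDX",
    0x01: "ERR",
}


def decode_status(value):
    """Return the names of all bits set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS.items() if value & bit]


def lba48_bytes(lba):
    """Split a 48-bit LBA into its six register bytes, high byte first."""
    return [(lba >> shift) & 0xFF for shift in range(40, -8, -8)]


lba = 8830587504648  # LBAsect from the hdd log above
print(hex(lba))
print([hex(b) for b in lba48_bytes(lba)])
print(decode_status(0x7F))
```

The LBAsect value is 0x080808080808, so all six address registers read back the same byte 0x08, and status 0x7f has every bit except BSY set at once. A real drive would be unlikely to report either, which suggests the registers were misread rather than that the drive returned a genuine address.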
My idea: libata is not able to handle this issue the way the legacy IDE
driver does. As logged, the channel gets reset and both drives are in
PIO mode from then on, but I can manually set them to DMA again and it
works "as well" as before. With libata I am sure this would have been
yet another reason for a reset. libata seems to always force UDMA mode
after the reset. Is there a way to work around that?
Generally, the DMA behaviour of legacy IDE is much better in my opinion:
the DMA mode can be set from userspace with hdparm. libata does not
offer such a possibility, does it? So I have no way to control or
change the behaviour after boot; I have to hope that the fallback
mechanism is good enough...
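The "FORCE: xfer_mask set to ..." lines in the libata log come from the libata.force= kernel parameter, which takes a comma-separated list of PORT.DEVICE:MODE entries. It only applies at boot, so it is not a replacement for changing the mode at runtime with hdparm. As a small sketch, the Python helper below (a hypothetical name, only for illustration) builds such a parameter string for limiting both devices on ata2 to UDMA2; the device IDs are the ones libata prints in the log.

```python
def libata_force(limits):
    """Build a libata.force= kernel parameter from {"port.device": "mode"}."""
    entries = ",".join(f"{dev}:{mode}" for dev, mode in limits.items())
    return "libata.force=" + entries


# Limit both devices on port 2 to UDMA2 (UDMA/33).
print(libata_force({"2.00": "udma2", "2.01": "udma2"}))
```

This prints libata.force=2.00:udma2,2.01:udma2, which would go on the kernel command line in the bootloader configuration.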
I also saw "harmless" IDE communication problems:
hdd: ide_dma_sff_timer_expiry: DMA status (0x61)
hdd: DMA timeout error
hdd: dma timeout error: status=0x80 { Busy }
hdd: possibly failed opcode: 0x25
hdc: DMA disabled
hdd: DMA disabled
ide1: reset: success
After this reset I simply re-enabled DMA as well; the machine is still
running and no reset was necessary. What would libata do?
Any ideas? I am really desperate by now. :'-(
Thanks!
Alois
Robert Hancock wrote:
On 06/10/2010 01:52 PM, MadLoisae@xxxxxxx wrote:
Hi there,
Currently I am using kernel 2.6.34; I have been affected by this issue
in every (stable) release since 2.6.30.
Today I reactivated the legacy, regrettably deprecated, parallel ATA
support and disabled libata. It's a shame: libata is much faster
(about 20% faster I/O, measurably) and more forward-looking, but it is
not a real alternative if it crashes continually, though not
reproducibly, on VIA chipsets (google for this, the web is full of this
issue!). I know VIA chipsets are not very good, but shouldn't we try to
make things better (or at least as good as possible) with newer drivers
instead of worse?
I hope legacy ATA support won't be removed soon from the kernel
sources ...
I like trying things out, but if the response is so thin, it is no fun
just looking at the same issue with the same messages again and again,
not able to do anything besides looking at it and resetting the box
afterwards ...
Have you tried limiting the speed to UDMA2? If that helps then it
could be that the motherboard circuitry, etc. isn't suitable for
faster speeds.
Random timeouts are unfortunately quite hard to debug since there's so
many problems that can cause them but the symptoms are the same: could
be that there was an error on the bus that caused something to stall,
an interrupt got lost somehow, etc. Or maybe the timing of device
access is somehow different and thus more likely to trigger whatever
the cause is. There also seem to be a fair number of bugs in these IDE
chipsets that the driver has to work around; it could be that one is
missing in the libata version.