On Fri, 23 Apr 2004, Ralf Baechle wrote:

> > > success report for the MC Bus Error handler :)
> > >
> > > Apr 19 23:17:32 resume kernel: MC Bus Error
> > > Apr 19 23:17:32 resume kernel: CPU error 0x380<RD PAR > @ 0x0f4c6308
> > > Apr 19 23:17:32 resume kernel: Instruction bus error, epc == 2accf310, ra == 2accf2c8
> > >
> > > I guess I have bad memory. The interesting point is that the machine
> > > continued to run for another 2 days. Shouldn't a memory error halt the
> > > machine?
> >
> > As it happened in user mode, I'd expect only the victim process to be
> > killed.
>
> The KSU bits are meaningless.  On Indy like most other MIPS systems a
> bus error exception may be delayed.  So the generic solution requires

 I beg your pardon?  AFAIK, bus errors are documented to be reported
precisely and my past experience with the systems I use confirms this.
Otherwise the bits in <asm/paccess.h> wouldn't work, but they do.

 Of course this is true for errors happening on read transactions (I have
trouble imagining a delayed read), but the semantics of the exception are
defined only for reads anyway.  For other transactions a general-purpose
interrupt should be used (and normally is).  Such an interrupt can happen
at any time, indeed (but here it was an IBE, not an interrupt).

> tracking down the actual user, something which in the current kernel is
> relatively easy due to rmap.

 Well, that may be tough anyway -- imagine an uncorrectable memory error on
a DMA transaction. ;-)

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@xxxxxxxxxxxxx, PGP key available        +