Re: [PATCH] New way of storing MCA/INIT logs

On Tue, Mar 11, 2008 at 03:07:20PM +0100, Zoltan Menyhart wrote:
> Let me ask again: do you expect _independent_ MCAs to happen?
> If you have got an estimate of the probability of independent
> MCAs happening at the same time, different from what I calculated,
> then please share it with us.
>
> If the MCAs are the consequences of the same error event, then
> you can find out what they are, where they are from 2 or 3 logs.
>
> The code actually tries to recover local MCAs only. They are:
> - TLB errors: per CPU local. As the CPUs are much more reliable
>  than the other components, e.g. the memory, having two or
>  more CPUs with corrupted TLBs at the same time is really unlikely.
> - I/O or memory read errors:
>  + One error has affected N CPUs: the first log is enough.
>  + More than one independent error at the same time: assuming
>    my estimations are more or less correct...

I don't know enough in this area to be of much use, but I do recall
times when a customer machine ran into an error and neither the first
nor the last record was of any use; only one of the intermediate
records was.  I recall taking nearly a day to find the critical
difference, and I vaguely recall there were on the order of 120 records
and the useful record was in the early 80s.  Russ certainly has more
experience in this area.

Thanks,
Robin