On Fri, Oct 11, 2013 at 02:32:38AM -0400, Chen, Gong wrote:
> [56005.785917] {3}Hardware error detected on CPU0
> [56005.785959] {3}event severity: corrected
> [56005.785975] {3}sub_event[0], severity: corrected
> [56005.785977] {3}section_type: memory error
> [56005.785981] {3}physical_address: 0x0000000851fe0000
> [56005.786027] {3}DIMM location: Memriser1 CHANNEL A DIMM 0

Very good guys, I've been waiting for years for this to be possible,
good job! :-)

Btw, what's "Memriser1"?

> [56005.786154] {4}Hardware error detected on CPU0
> [56005.786159] {4}event severity: corrected
> [56005.786162] {4}sub_event[0], severity: corrected

This sub_event[0] could use better decoding though.

> [56005.786166] {4}section_type: memory error
>
>
> trace output:
>
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 4/4   #P:120
> #
> #                              _-----=> irqs-off
> #                             / _----=> need-resched
> #                            | / _---=> hardirq/softirq
> #                            || / _--=> preempt-depth
> #                            ||| /     delay
> #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
> #              | |       |   ||||       |         |
> ...
> ...
> <idle>-0 [000] d.h. 56068.488759: extlog_mem_event: 3 corrected errors:unknown

That "unknown" thing needs a " " in front of it and comes from
cper_mem_err_type_str, AFAICT. I'm guessing the value is 0 and
uninitialized or so?

> on Memriser1 CHANNEL A DIMM 0(FRU:

Also another " " missing here.

> 00000000-0000-0000-0000-000000000000 physical addr: 0x0000000851fe0000 node: 0 card: 0 module: 0 rank: 0 bank: 0 row: 28927 column: 1296)
> <idle>-0 [000] d.h. 56068.488834: extlog_mem_event: 4 corrected errors:unknown
> ...
> ...
>
> dmesg output are shrank to only keep the most important data. The trace
> output will contain most of data. Not sure if all fields are meaningful
> to users. Some fields like FRU ID/FRU TEXT depends on BIOS manufactor.
> So welcome to add comments for what is needed or not.

Yeah, I guess we again depend on BIOS people to fill those in. I'd
expect serious server manufacturers who care about RAS to do so...

Thanks.
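To make the two spacing remarks and the "unknown" guess concrete, here is a
minimal userspace sketch. It is not the kernel's code: the helper names
err_type_str() and format_mem_event(), the shortened string table, and the
simplified output line are all hypothetical, and the table assumes that
index 0 maps to "unknown".

```c
#include <stddef.h>
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, shortened table; index 0 is assumed to mean "unknown". */
static const char * const err_type_strs[] = {
	"unknown",
	"no error",
	"single-bit ECC",
	"multi-bit ECC",
};

/* Map an error type to a string, falling back to "unknown" when out of range. */
static const char *err_type_str(unsigned int etype)
{
	return etype < ARRAY_SIZE(err_type_strs) ?
		err_type_strs[etype] : "unknown";
}

/*
 * Build a simplified event line. Note the space after "errors:" and the
 * space before "(FRU:", which are the two spots flagged above.
 */
static int format_mem_event(char *buf, size_t len, unsigned int count,
			    unsigned int etype, const char *dimm,
			    const char *fru)
{
	return snprintf(buf, len, "%u corrected errors: %s on %s (FRU: %s)",
			count, err_type_str(etype), dimm, fru);
}
```

With such a table, an error type left at 0 prints as "unknown", which would
match the trace output quoted above if the field is indeed zero.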
-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--