On Mon, Dec 02, 2013 at 01:05:16PM +0800, rui wang wrote:
> > +    TP_printk("%s PCIe Bus Error: severity=%s, %s\n",
> > +        __get_str(dev_name),
> > +        __entry->severity == HW_EVENT_ERR_CORRECTED ? "Corrected" :
> > +            __entry->severity == HW_EVENT_ERR_FATAL ?
> > +            "Fatal" : "Uncorrected",
> > +        __entry->severity == HW_EVENT_ERR_CORRECTED ?
> > +        __print_flags(__entry->status, "|", aer_correctable_errors) :
> > +        __print_flags(__entry->status, "|", aer_uncorrectable_errors))
> > +);
>
> This causes inconsistency between dmesg and the trace event output.
> When dmesg says "severity=Corrected", the trace event says
> "severity=Fatal". What happens is that HW_EVENT_ERR_CORRECTED is
> defined in edac.h:
>
> enum hw_event_mc_err_type {
>     HW_EVENT_ERR_CORRECTED,
>     HW_EVENT_ERR_UNCORRECTED,
>     HW_EVENT_ERR_FATAL,
>     HW_EVENT_ERR_INFO,
> };
>
> while aer_print_error() uses aer_error_severity_string[] defined as:
>
> static const char *aer_error_severity_string[] = {
>     "Uncorrected (Non-Fatal)",
>     "Uncorrected (Fatal)",
>     "Corrected"
> };
>
> In this case dmesg is correct because info->severity is assigned in
> aer_isr_one_error() using the definitions in include/linux/ras.h:
> #define AER_NONFATAL 0
> #define AER_FATAL 1
> #define AER_CORRECTABLE 2
>
> So which one is the standard? Is there a plan to unify all these names?

Yes, the AER tracepoint above should use the AER_* defines and not the
HW_EVENT_ERR_* ones which are for memory errors.

Wanna send a fix?

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
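
For illustration, the mismatch discussed above can be reproduced in plain userspace C. This is only a sketch, not the actual fix: the enum and the AER_* values are copied from the quoted mail, while tp_severity_buggy(), tp_severity_fixed() and check_severity_mapping() are hypothetical helper names invented here. The "fixed" variant shows one possible shape of keying the ternary on the AER_* defines, as suggested in the reply.

```c
#include <assert.h>
#include <string.h>

/* Memory-error severities, as quoted from edac.h in the thread. */
enum hw_event_mc_err_type {
    HW_EVENT_ERR_CORRECTED,
    HW_EVENT_ERR_UNCORRECTED,
    HW_EVENT_ERR_FATAL,
    HW_EVENT_ERR_INFO,
};

/* AER severities, as quoted in the thread. */
#define AER_NONFATAL    0
#define AER_FATAL       1
#define AER_CORRECTABLE 2

/* Hypothetical helper: mirrors the severity ternary in the quoted TP_printk(). */
static const char *tp_severity_buggy(int severity)
{
    return severity == HW_EVENT_ERR_CORRECTED ? "Corrected" :
           severity == HW_EVENT_ERR_FATAL ? "Fatal" : "Uncorrected";
}

/* Hypothetical helper: the same ternary keyed on the AER_* defines instead. */
static const char *tp_severity_fixed(int severity)
{
    return severity == AER_CORRECTABLE ? "Corrected" :
           severity == AER_FATAL ? "Fatal" : "Uncorrected";
}

/* Hypothetical self-check covering all three AER severities. */
static void check_severity_mapping(void)
{
    /* AER_CORRECTABLE (2) collides with HW_EVENT_ERR_FATAL (2). */
    assert(strcmp(tp_severity_buggy(AER_CORRECTABLE), "Fatal") == 0);
    /* AER_NONFATAL (0) collides with HW_EVENT_ERR_CORRECTED (0). */
    assert(strcmp(tp_severity_buggy(AER_NONFATAL), "Corrected") == 0);
    /* AER_FATAL (1) matches neither compared value and falls through. */
    assert(strcmp(tp_severity_buggy(AER_FATAL), "Uncorrected") == 0);

    /* With the AER_* defines, each severity gets its own label. */
    assert(strcmp(tp_severity_fixed(AER_CORRECTABLE), "Corrected") == 0);
    assert(strcmp(tp_severity_fixed(AER_NONFATAL), "Uncorrected") == 0);
    assert(strcmp(tp_severity_fixed(AER_FATAL), "Fatal") == 0);
}
```

Calling check_severity_mapping() from a small main() exercises both variants. Note that the quoted TP_printk() also picks between aer_correctable_errors and aer_uncorrectable_errors with the same HW_EVENT_ERR_CORRECTED comparison, so the status flags are decoded against the wrong table in the same cases.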