On Wed, Aug 21, 2019 at 04:15:13PM -0700, Rajat Jain wrote:
> Split the AER stats into multiple sysfs attributes. Note that
> this changes the ABI of the AER stats, but hopefully, there
> aren't active users that need to change. This is how the AERs
> are being exposed now:
>
> localhost /sys/devices/pci0000:00/0000:00:1c.0/aer_stats # ls -l
> total 0
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit0_RxErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit12_Timeout
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit13_NonFatalErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit14_CorrIntErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit15_HeaderOF
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit6_BadTLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit7_BadDLLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 correctable_bit8_Rollover
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit0_Undefined
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit12_TLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit13_FCP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit14_CmpltTO
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit15_CmpltAbrt
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit16_UnxCmplt
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit17_RxOF
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit18_MalfTLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit19_ECRC
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit20_UnsupReq
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit21_ACSViol
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit22_UncorrIntErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit23_BlockedTLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit24_AtomicOpBlocked
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit25_TLPBlockedErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit26_PoisonTLPBlocked
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit4_DLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 fatal_bit5_SDES
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit0_Undefined
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit12_TLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit13_FCP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit14_CmpltTO
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit15_CmpltAbrt
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit16_UnxCmplt
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit17_RxOF
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit18_MalfTLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit19_ECRC
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit20_UnsupReq
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit21_ACSViol
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit22_UncorrIntErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit23_BlockedTLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit24_AtomicOpBlocked
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit25_TLPBlockedErr
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit26_PoisonTLPBlocked
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit4_DLP
> -r--r--r--. 1 root root 4096 Aug 20 16:35 nonfatal_bit5_SDES
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_device_err_cor
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_device_err_fatal
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_device_err_nonfatal
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_rootport_err_cor
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_rootport_err_fatal
> -r--r--r--. 1 root root 4096 Aug 20 16:35 total_rootport_err_nonfatal
> localhost /sys/devices/pci0000:00/0000:00:1c.0/aer_stats #
>
> Each file has a single counter value. Single file containing all
> stats was frowned upon and discussed here:
> https://lkml.org/lkml/2019/6/28/220
>
> Signed-off-by: Rajat Jain <rajatja@xxxxxxxxxx>
> ---
> I personally think that this makes it a little overwhelming for a human,
> e.g. I could look at total but don't exactly know which file to look at
> next in order to drill down. But I couldn't think of any other way. Some
> problems I'd have liked to fix but they require deeper surgery:
>
> * Now each PCI device sysfs node will have a sub-directory called aer_stats.
>   (The subdirectory will have attributes only if it supports AER, but
>   the sub directory will always be present).
>
> * This patch isn't re-using the strings array like it was using earlier.
> I thought of adding the attribute group at run time, so it will take
> care of both the problems, but can only do that after device_add() call,
> I think.
>
> If we are comfortable introducing a call to a new function
> pci_aer_stats_init() after call to device_add() in pci_device_add(),
> the above problems can be fixed.
>
>  drivers/pci/pcie/aer.c | 166 +++++++++++++++++++++++++++++------------
>  1 file changed, 119 insertions(+), 47 deletions(-)

You need a Documentation/ABI/ update for the new sysfs files before we
can properly review this to see if you are doing what you think you are
doing :)

thanks,

greg k-h
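
For anyone following along, here is a rough userspace sketch of how the
proposed one-value-per-file layout could be consumed, including the "look at
the total, then drill down" step the patch notes worry about. It assumes the
file naming shown in the quoted listing and that every attribute holds a
single decimal integer, as the patch description says; that layout is only the
proposal under review, not a documented ABI. The helper names
read_aer_stats() and nonzero_by_severity() are made up for illustration.

```python
import os
import tempfile


def read_aer_stats(stats_dir):
    """Return {attribute_name: counter} for every regular file in stats_dir.

    Assumes one decimal integer per file, per the quoted patch description.
    """
    counters = {}
    for name in sorted(os.listdir(stats_dir)):
        path = os.path.join(stats_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            counters[name] = int(f.read().strip())
    return counters


def nonzero_by_severity(counters, severity):
    """Drill down: per-bit counters of one severity that are non-zero.

    severity is one of the prefixes seen in the listing:
    "correctable", "fatal" or "nonfatal".
    """
    prefix = severity + "_bit"
    return {k: v for k, v in counters.items() if k.startswith(prefix) and v}


if __name__ == "__main__":
    # Demo against a throwaway directory with made-up values, so this runs
    # without AER hardware or the patch applied.
    with tempfile.TemporaryDirectory() as d:
        sample = {
            "total_device_err_cor": 3,
            "correctable_bit0_RxErr": 3,
            "correctable_bit6_BadTLP": 0,
            "fatal_bit4_DLP": 0,
        }
        for name, value in sample.items():
            with open(os.path.join(d, name), "w") as f:
                f.write("%d\n" % value)
        stats = read_aer_stats(d)
        if stats["total_device_err_cor"]:
            print(nonzero_by_severity(stats, "correctable"))
```

On a real system the directory would be something like the aer_stats path in
the listing above, but only if a kernel carrying this patch is running.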