> -----Original Message-----
> From: Iyer, Shyam
> Sent: Thursday, May 17, 2012 1:52 PM
> To: 'Prarit Bhargava'
> Cc: 'linux-pci@xxxxxxxxxxxxxxx'; 'bhelgaas@xxxxxxxxxx'
> Subject: RE: [PATCH] pci, Add AER_panic sysfs file
>
> > -----Original Message-----
> > From: Prarit Bhargava [mailto:prarit@xxxxxxxxxx]
> > Sent: Thursday, May 17, 2012 1:39 PM
> > To: Iyer, Shyam
> > Cc: linux-pci@xxxxxxxxxxxxxxx; bhelgaas@xxxxxxxxxx
> > Subject: Re: [PATCH] pci, Add AER_panic sysfs file
> >
> > On 05/17/2012 01:29 PM, Shyam_Iyer@xxxxxxxx wrote:
> > >
> > >> -----Original Message-----
> > >> From: linux-pci-owner@xxxxxxxxxxxxxxx
> > >> [mailto:linux-pci-owner@xxxxxxxxxxxxxxx] On Behalf Of Prarit Bhargava
> > >> Sent: Thursday, May 17, 2012 1:05 PM
> > >> To: linux-pci@xxxxxxxxxxxxxxx
> > >> Cc: Prarit Bhargava; Bjorn Helgaas
> > >> Subject: [PATCH] pci, Add AER_panic sysfs file
> > >>
> > >> Consider the following case
> > >>
> > >>              [ RP ]
> > >>                 |
> > >>                 |
> > >>       +---------+-----------+
> > >>       |         |           |
> > >>     [H1]      [H2]        [X1]
> > >>
> > >> where RP is a PCIE Root Port, H1 and H2 are devices with drivers that
> > >> support PCIE AER driver error handling (i.e., they have
> > >> pci_error_handlers defined in the driver), and X1 is a device with a
> > >> driver that does not support PCIE AER driver error handling.
> > >>
> > >> If the Root Port takes an error what currently happens is that the
> > >> bus resets and H1 & H2 call their slot_reset functions. X1 does
> > >> nothing.
> > >>
> > >> In some cases a user may not wish the system to continue because X1
> > >> is an unhardened driver. In these cases, the system should not do a
> > >> bus reset, but rather the system should panic to avoid any further
> > >> possible data corruption.
> > >
> > > Do we need to panic for both correctable and uncorrectable errors?
> > >
> > > I thought correctable errors could recover without a bus reset.
> >
> > Will a bus reset be issued on a correctable error? I thought the code
> > path was that the bus reset was issued on the uncorrectable error.
> >
> > drivers/pci/pcie/aer/aerdrv_core.c: do_recovery()
> >
> >	if (severity == AER_FATAL) {
> >		result = reset_link(dev);
> >		if (result != PCI_ERS_RESULT_RECOVERED)
> >			goto failed;
> >	}
> >
> > I may not be looking at the right spot of code. Care to enlighten me? :)
> >
> > P.
>
> Actually I was reading the documentation ..
> Documentation/PCI/pcieaer-howto.txt
>
> "
> Correctable errors pose no impacts on the functionality of the
> interface. The PCI Express protocol can recover without any software
> intervention or any loss of data. These errors are detected and
> corrected by hardware. Unlike correctable errors, uncorrectable
> errors impact functionality of the interface. Uncorrectable errors
> can cause a particular transaction or a particular PCI Express link
> to be unreliable. Depending on those error conditions, uncorrectable
> errors are further classified into non-fatal errors and fatal errors.
> Non-fatal errors cause the particular transaction to be unreliable,
> but the PCI Express link itself is fully functional. Fatal errors, on
> the other hand, cause the link to be unreliable.
> "
>
> But anyways the AER_FATAL is true for uncorrectable errors only and not
> for correctable errors which means reset_link doesn't happen for
> correctable errors.
>
> drivers/pci/pcie/aer/aerdrv_core.c
>
>	if (e_src->status & PCI_ERR_ROOT_UNCOR_RCV) {
>		e_info->id = ERR_UNCOR_ID(e_src->id);
>
>		if (e_src->status & PCI_ERR_ROOT_FATAL_RCV)
>			e_info->severity = AER_FATAL;
>		else
>			e_info->severity = AER_NONFATAL;
>
>		if (e_src->status & PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>			e_info->multi_error_valid = 1;
>		else
>			e_info->multi_error_valid = 0;
>
>		aer_print_port_info(p_device->port, e_info);
>
>		if (find_source_device(p_device->port, e_info))
>			aer_process_err_devices(p_device, e_info);
>	}

Looks like we are saying the same thing; I had misunderstood and thought
you were doing a panic for each error.

The patch looks good to me if it matters.
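
For anyone following the thread, the conclusion above can be put as a small
standalone sketch. This is not the kernel code: every sketch_* name is
hypothetical, and the three bit values are assumed to match the Root Error
Status definitions in include/linux/pci_regs.h. It also simplifies by
reporting only the most severe class found in a status word, whereas the
quoted code handles the uncorrectable case on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Root Error Status bits; values are assumed to match pci_regs.h. */
#define SKETCH_ROOT_COR_RCV	0x00000001u	/* correctable error received */
#define SKETCH_ROOT_UNCOR_RCV	0x00000004u	/* uncorrectable error received */
#define SKETCH_ROOT_FATAL_RCV	0x00000040u	/* fatal error message received */

/* Hypothetical severity classes, ordered from least to most severe. */
enum sketch_severity {
	SKETCH_NONE,
	SKETCH_CORRECTABLE,
	SKETCH_NONFATAL,
	SKETCH_FATAL,
};

/* Return the most severe class reported in a Root Error Status word. */
enum sketch_severity sketch_classify(uint32_t status)
{
	if (status & SKETCH_ROOT_UNCOR_RCV) {
		/* Uncorrectable errors split into fatal and non-fatal. */
		if (status & SKETCH_ROOT_FATAL_RCV)
			return SKETCH_FATAL;
		return SKETCH_NONFATAL;
	}
	if (status & SKETCH_ROOT_COR_RCV)
		return SKETCH_CORRECTABLE;
	return SKETCH_NONE;
}

/* Mirrors the quoted do_recovery() test: only fatal errors reset the link. */
bool sketch_needs_link_reset(enum sketch_severity sev)
{
	return sev == SKETCH_FATAL;
}

/* Small self-check of the rule discussed in the thread. */
void sketch_self_check(void)
{
	/* A correctable-only status never leads to a link reset. */
	assert(!sketch_needs_link_reset(sketch_classify(SKETCH_ROOT_COR_RCV)));
	/* Uncorrectable plus fatal does. */
	assert(sketch_needs_link_reset(
		sketch_classify(SKETCH_ROOT_UNCOR_RCV | SKETCH_ROOT_FATAL_RCV)));
}
```

Calling sketch_classify() with only the correctable bit set gives
SKETCH_CORRECTABLE, for which sketch_needs_link_reset() returns false. That is
the point both sides of the thread ended up agreeing on.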