> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@xxxxxxxxxx> On Behalf Of Jesse Brandeburg
> Sent: Wednesday, October 4, 2023 10:32 PM
> To: intel-wired-lan@xxxxxxxxxxxxxxxx
> Cc: pmenzel@xxxxxxxxxxxxx; Vishal Agrawal <vagrawal@xxxxxxxxxx>; linux-pci@xxxxxxxxxxxxxxx; Brandeburg, Jesse <jesse.brandeburg@xxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; jkc@xxxxxxxxxx; Kitszel, Przemyslaw <przemyslaw.kitszel@xxxxxxxxx>
> Subject: [Intel-wired-lan] [PATCH iwl-net v3] ice: reset first in crash dump kernels
>
> When the system boots into the crash dump kernel after a panic, the ice
> networking device may still have pending transactions that can cause errors
> or machine checks when the device is re-enabled. This can prevent the crash
> dump kernel from loading the driver or collecting the crash data.
>
> To avoid this issue, perform a function level reset (FLR) on the ice device
> via PCIe config space before enabling it on the crash kernel. This will
> clear any outstanding transactions and stop all queues and interrupts.
> Restore the config space after the FLR, otherwise it was found in testing
> that the driver wouldn't load successfully.
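
For readers following along without the diff (it is not quoted in this
mail), the ordering described above can be modelled in plain userspace C.
This is only a sketch under assumptions, not the code from the patch: the
mock_* names and struct mock_pci_dev are hypothetical stand-ins, and I am
assuming they correspond to the kernel helpers pci_save_state(),
pcie_flr(), pci_restore_state() and pcim_enable_device(). The stubs only
record the order in which they are called.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct pci_dev: it only records
 * the order in which the stubbed helpers are called.
 */
struct mock_pci_dev {
	char log[128];
	bool flr_fails;	/* set to make the stubbed FLR report an error */
};

/* Append one step name to the comma-separated call log. */
static void mock_log(struct mock_pci_dev *pdev, const char *step)
{
	size_t used = strlen(pdev->log);

	snprintf(pdev->log + used, sizeof(pdev->log) - used, "%s%s",
		 used ? "," : "", step);
}

/* Stubs named after the kernel helpers they are assumed to stand in for. */
static void mock_pci_save_state(struct mock_pci_dev *pdev)
{
	mock_log(pdev, "save");
}

static int mock_pcie_flr(struct mock_pci_dev *pdev)
{
	mock_log(pdev, "flr");
	/* -ENOTTY is an arbitrary error value chosen for illustration. */
	return pdev->flr_fails ? -ENOTTY : 0;
}

static void mock_pci_restore_state(struct mock_pci_dev *pdev)
{
	mock_log(pdev, "restore");
}

static int mock_pcim_enable_device(struct mock_pci_dev *pdev)
{
	mock_log(pdev, "enable");
	return 0;
}

/* Models the start of a probe routine: in a crash dump kernel, save the
 * config space, reset the function, restore the config space, and only
 * then enable the device. In a normal kernel, enable it directly.
 */
static int mock_probe(struct mock_pci_dev *pdev, bool is_kdump)
{
	int err;

	assert(pdev != NULL);

	if (is_kdump) {
		mock_pci_save_state(pdev);
		err = mock_pcie_flr(pdev);
		if (err)
			return err;	/* do not enable a device that failed to reset */
		mock_pci_restore_state(pdev);
	}

	return mock_pcim_enable_device(pdev);
}
```

Calling mock_probe() with is_kdump set on a zero-initialized
struct mock_pci_dev leaves "save,flr,restore,enable" in the log, and
with is_kdump clear only "enable". Saving before the reset is what makes
the restore step from the commit message possible, since the FLR wipes
the function's config space.
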
>
> The following sequence causes the original issue:
> - Load the ice driver with modprobe ice
> - Enable SR-IOV with 2 VFs: echo 2 > /sys/class/net/eth0/device/sriov_num_vfs
> - Trigger a crash with echo c > /proc/sysrq-trigger
> - Load the ice driver again (or let it load automatically) with modprobe ice
> - The system crashes again during pcim_enable_device()
>
> Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series")
>
> Reported-by: Vishal Agrawal <vagrawal@xxxxxxxxxx>
> Reviewed-by: Jay Vosburgh <jay.vosburgh@xxxxxxxxxxxxx>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@xxxxxxxxx>
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@xxxxxxxxx>
> ---
> v3: add Fixes tag as approximate, added Jay's RB tag
> v2: respond to list comments and update commit message
> v1: initial version
> ---
>  drivers/net/ethernet/intel/ice/ice_main.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>

Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@xxxxxxxxx> (A Contingent worker at Intel)