On Fri, 30 Nov 2018 05:29:47 +0000
Bharat Bhushan <bharat.bhushan@xxxxxxx> wrote:

> Hi,
>
> > -----Original Message-----
> > From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > Sent: Thursday, November 29, 2018 1:46 AM
> > To: Bharat Bhushan <bharat.bhushan@xxxxxxx>
> > Cc: alex.williamson@xxxxxxxxxx; Bjorn Helgaas <helgaas@xxxxxxxxxx>; linux-
> > pci@xxxxxxxxxxxxxxx; Linux Kernel Mailing List <linux-
> > kernel@xxxxxxxxxxxxxxx>; bharatb.yadav@xxxxxxxxx; David Daney
> > <david.daney@xxxxxxxxxx>; jglauber@xxxxxxxxxx;
> > mbroemme@xxxxxxxxxx; chrisrblake93@xxxxxxxxx
> > Subject: Re: [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus
> >
> > On Tue, Nov 27, 2018 at 10:32 PM Bharat Bhushan
> > <bharat.bhushan@xxxxxxx> wrote:
> >
> > > > -----Original Message-----
> > > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > > Sent: Tuesday, November 27, 2018 9:39 PM
> > > > To: Bjorn Helgaas <helgaas@xxxxxxxxxx>
> > > > Cc: Bharat Bhushan <bharat.bhushan@xxxxxxx>;
> > > > linux-pci@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > > bharatb.yadav@xxxxxxxxx; David Daney <david.daney@xxxxxxxxxx>; Jan
> > > > Glauber <jglauber@xxxxxxxxxx>; Maik Broemme <mbroemme@xxxxxxxxxx>;
> > > > Chris Blake <chrisrblake93@xxxxxxxxx>
> > > > Subject: Re: [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus
> > > >
> > > > On Tue, 27 Nov 2018 09:33:56 -0600
> > > > Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >
> > > > > 4) Is there a hardware erratum for this? If so, please include
> > > > > the URL here.
> > >
> > > No h/w errata as of now.
> >
> > Does that mean (a) the HW folks agree this is a hardware problem but they
> > haven't written an erratum, (b) there is an erratum but it isn't public, (c) we
> > don't have any concrete evidence of a hardware problem, but things just
> > don't work if we do a bus reset, (d) something else?
>
> I will say it is (c) - not concluded to be hardware h/w issue.
>
> >
> > > In pci_reset_secondary_bus() I have tried to increase the delay after reset
> > > but not helped.
> > > Do I need to add delay at some other place as well?
> >
> > No, I think the place you tried should be enough.
> >
> > You should also be able to exercise this from user-space by using "setpci" to
> > set and clear the Secondary Bus Reset bit in the Bridge Control register. Then
> > you can also use setpci to read/write config space of the NIC. The kernel
> > would normally read the Vendor and Device IDs as the first access to the
> > device during enumeration. You also might be able to learn something by
> > using "lspci -vv" on the bridge before and after the reset to see if it logs any
> > AER bits (if it supports AER) or the other standard error logging bits.
>
> I tried below sequence for Secondary bus reset and device config space show 0xff
>
> root@localhost:~# lspci -x
> 0002:00:00.0 PCI bridge: Freescale Semiconductor Inc Device 80c0 (rev 10)
> 00: 57 19 c0 80 07 01 10 00 10 00 04 06 08 00 01 00
> 10: 00 00 00 00 00 00 00 00 00 01 ff 00 01 01 00 00
> 20: 00 40 00 40 f1 ff 01 00 00 00 00 00 00 00 00 00
> 30: 00 00 00 00 40 00 00 00 00 00 00 40 63 01 00 00
>
> 0002:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
> 00: 86 80 d3 10 06 04 10 00 00 00 00 02 10 00 00 00
> 10: 00 00 0c 40 00 00 00 40 01 00 00 00 00 00 0e 40
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 1f a0
> 30: 00 00 24 40 c8 00 00 00 00 00 00 00 63 01 00 00
>
> root@localhost:~# setpci -s 0002:00:00.0 0x3e.b=0x40
> root@localhost:~# setpci -s 0002:00:00.0 0x3e.b=0x00
>
> root@localhost:~# lspci -x
> 0002:00:00.0 PCI bridge: Freescale Semiconductor Inc Device 80c0 (rev 10)
> 00: 57 19 c0 80 07 01 10 00 10 00 04 06 08 00 01 00
> 10: 00 00 00 00 00 00 00 00 00 01 ff 00 01 01 00 00
> 20: 00 40 00 40 f1 ff 01 00 00 00 00 00 00 00 00 00
> 30: 00 00 00 00 40 00 00 00 00 00 00 40 63 01 00 00

Just for curiosity's sake, what if you re-write the secondary and
subordinate bus registers here:

# setpci -s 0002:00:00.0 0x19.b=0x01
# setpci -s 0002:00:00.0 0x1a.b=0xff

IIRC the users that debugged the AMD bus reset issue re-wrote the
entire 64 bytes of the bridge config header and then further narrowed
the issue down to the two registers above. If one bridge implementation
can have such an issue, maybe others do too. Perhaps there's common
IP in use.

Are you able to test other endpoints besides this e1000e device with
this setpci technique? Thanks,

Alex

> 0002:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection (rev ff)
> 00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>
> Thanks
> -Bharat
>
>
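
For repeating the experiment on other endpoints, the whole sequence (pulse
Secondary Bus Reset, re-write the secondary and subordinate bus numbers,
then read the endpoint's Vendor ID) could be wrapped up roughly as below.
This is only a sketch, not something that was run on the LS1088: the
function names sbr_pulse and vendor_id and the SETPCI and SBR_DELAY
variables are made up for illustration, the 1 second delay is an arbitrary
choice, and the value:mask form is the one described in the setpci man
page. The register offsets (0x3e, 0x19, 0x1a) and the 0x40 bit are the ones
already used in this thread.

```shell
#!/bin/sh
# Sketch only. SETPCI and SBR_DELAY are hypothetical knobs of this example:
# SETPCI can be pointed at a stub to preview the sequence without touching
# hardware, SBR_DELAY is the pause (in seconds) around the reset.
SETPCI=${SETPCI:-setpci}
SBR_DELAY=${SBR_DELAY:-1}

# sbr_pulse BRIDGE SECONDARY SUBORDINATE
# Pulse Secondary Bus Reset on BRIDGE, then re-write its bus numbers.
sbr_pulse() {
    bridge=$1
    sec=$2
    sub=$3
    # Bridge Control is at 0x3e; bit 6 (0x40) is Secondary Bus Reset.
    # value:mask changes only that bit and leaves the rest untouched.
    "$SETPCI" -s "$bridge" 0x3e.w=0x0040:0x0040
    sleep "$SBR_DELAY"
    "$SETPCI" -s "$bridge" 0x3e.w=0x0000:0x0040
    sleep "$SBR_DELAY"
    # Re-write the secondary (0x19) and subordinate (0x1a) bus numbers.
    "$SETPCI" -s "$bridge" 0x19.b="$sec"
    "$SETPCI" -s "$bridge" 0x1a.b="$sub"
}

# vendor_id DEVICE
# Read the 16-bit Vendor ID at offset 0x00; all ones means the device
# did not respond to the config read.
vendor_id() {
    "$SETPCI" -s "$1" 0x00.w
}
```

With the addresses from the dumps above, that would be called as
"sbr_pulse 0002:00:00.0 0x01 0xff" followed by "vendor_id 0002:01:00.0",
as root.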