Re: [bug report] lockdep WARN at PCI device rescan

Shinichiro Kawasaki <shinichiro.kawasaki@xxxxxxx> · Fri, 1 Dec 2023 10:46:19 +0000

On Nov 30, 2023 / 10:36, Lukas Wunner wrote:
> On Thu, Nov 30, 2023 at 07:30:56AM +0000, Shinichiro Kawasaki wrote:
> > On Nov 29, 2023 / 15:53, Andy Shevchenko wrote:
> > > On Wed, Nov 29, 2023 at 03:50:21PM +0200, Andy Shevchenko wrote:
> > > > On Wed, Nov 29, 2023 at 12:17:39PM +0100, Lukas Wunner wrote:
> > > > > Your patch uses a list to store a multitude of struct resource.
> > > > > Is that actually necessary?  I thought there can only be a single
> > > > > P2SB device in the system?
> > 
> > Yes, the list might be too much. I was not sure what is the expected
> > number of P2SB resources to be cached. I found drivers/mfd/lpc_ich.c
> > calls p2sb_bar() at two places for devfn=0 and devfn=(13,2), so at
> > least two resources look required. Not sure about the future.
> > If two static resources are sufficient, the code will be simpler.
> 
> About that p2sb_bar() call in lpc_ich.c for PCI_DEVFN(13, 2):
> 
> It's in a switch-case statement for INTEL_SPI_BXT.  BXT means Broxton,
> which is an Atom Goldmont based architecture.
> 
> If you look at p2sb_cpu_ids[], you'll notice the P2SB is located at
> PCI_DEVFN(13, 0) on Goldmont.
> 
> PCI functions with function number > 0 are not enumerable unless there is
> a PCI function with function number 0.
> 
> So p2sb_bar() first unhides the P2SB at PCI_DEVFN(13, 0), then the
> SPI function at PCI_DEVFN(13, 2) becomes enumerable and p2sb_bar()
> retrieves the BAR address of that function.
> 
> Unfortunately this is a little byzantine.

Thanks for the explanation. This helped me to understand what Andy wrote in
antoher e-mail.

> 
> For the caching approach I guess it means you can assume there is only
> a single P2SB device in the system but you need to cache not just the
> P2SB BAR, but also the BARs of functions 1 .. 7 of the P2SB device
> if the P2SB's function number is 0.  I don't know if each of those
> upper functions only ever has a single BAR, but probably that's the case.

I see. I think this is consistent with Andy's explanation.

> 
> 
> > Lukas, thank you for the idea. If I understand the comment by Andy
> > correctly, P2SB should not be unhidden between arch_initcall and
> > fs_initcall. Hmm.
> > 
> > This made me think: how about to unhide and hide P2SB just during
> > fs_initcall to cache the P2SB resources? To try it, I added a function
> > below on top of the previous trial patch. The added function calls
> > p2sb_bar() for devfn=0 at fs_initcall so that the resource is cached
> > before probe of i2c-i801. This worked
> > good on my system. It avoided the deadlock as well as the lockdep WARN :)
> 
> This may work if i2c-i801 is compiled as a module, but may not work
> if it's builtin:  It would try to access the cached P2SB BAR when
> it's not yet been cached.  So you'd have to return -EPROBE_DEFER
> from p2sb_bar() if it hasn't been cached yet.  And you'd have to
> make sure that all the callers can cope with that return value.

I built i2c-i801 as a built-in module, and observed that my trial patch worked
good. IIUC, i2c-i801 probe happens after fs_initcall for p2sb resource caching,
even when i2c-i801 is built-in.

> 
> Another approach would be to cache the BARs very early at boot in
> arch/x86/kernel/early-quirks.c.  That would obviate the need to
> defer probing if the BAR hasn't been cached yet.
> 
> Looking through past discussions archived in lore, I've found an
> important issue raised by Bjorn:
> 
>    "Apparently this normally hidden device is consuming
>     PCI address space.  The PCI core needs to know about this.  If it
>     doesn't, the PCI core may assign this space to another device."
>     
>    https://lore.kernel.org/all/20210308185212.GA1790506@bjorn-Precision-5520/
> 
> arch/x86/kernel/early-quirks.c already reserves "stolen" memory used
> by Intel GPUs with unified-memory architecture.  It adjusts the e820
> map to achieve that.  I guess the same method could be used to reserve
> the memory used by P2SB (as well as "upper" functions if P2SB has
> function number 0).
> 
> An early version of p2sb_bar() (which wasn't mainlined) duplicated
> __pci_read_base().  I suggested to instead unhide and temporarily
> enumerate the device, retrieve the BAR, then destroy the pci_dev
> and hide the P2SB again:
> 
> https://lore.kernel.org/all/20220505145503.GA25423@xxxxxxxxx/
> 
> That resulted in a significant reduction in LoC.  But if the BAR
> caching happens in arch/x86/kernel/early-quirks.c, it may be
> necessary to duplicate at least portions of __pci_read_base() again.
> Or maybe the code can be simplified for this specific use case,
> I don't know.

Thanks. It's good to understand the past discussion (Wow, so big discussion was
held for p2sb.c introduction...). As I noted above, p2sb resource cache at
fs_initcall looks working good. I hope this fs_initcall approach is good enough
since it will be smaller patch than the resource cache at x86 early-quirks.