Re: [bug report] lockdep WARN at PCI device rescan

On Nov 29, 2023 / 15:53, Andy Shevchenko wrote:
> On Wed, Nov 29, 2023 at 03:50:21PM +0200, Andy Shevchenko wrote:
> > On Wed, Nov 29, 2023 at 12:17:39PM +0100, Lukas Wunner wrote:
> > > On Tue, Nov 28, 2023 at 07:45:06AM +0000, Shinichiro Kawasaki wrote:
> > > > On Nov 24, 2023 / 17:22, Andy Shevchenko wrote:
> 
> ...
> 
> > > > > Another possible solution I was thinking about is to have a local cache,
> > > > > so, make p2sb.c to be called just after PCI enumeration at boot time
> > > > > to cache the P2SB device's bar, and then cache the bar based on the device
> > > > > in question at the first call. Yet it may not remove all theoretical /
> > > > > possible scenarios with dead lock (taking into account hotpluggable
> > > > > devices), but won't fail the i801 driver in the above use case IIUC.
> > > > 
> > > > Thanks for the idea. I created an experimental patch below (it does not guard
> > > > list nor free the list elements, so it is incomplete). I confirmed that this
> > > > patch avoids the deadlock. So your idea looks working. I still observe the
> > > > deadlock WARN, but it looks better than the hang by the deadlock.
> > > 
> > > Your patch uses a list to store a multitude of struct resource.
> > > Is that actually necessary?  I thought there can only be a single
> > > P2SB device in the system?

Yes, the list might be overkill. I was not sure how many P2SB resources need to
be cached. I found that drivers/mfd/lpc_ich.c calls p2sb_bar() in two places,
for devfn=0 and devfn=(13,2), so at least two resources appear to be required.
I am not sure about future needs. If two static resources are sufficient, the
code will be simpler.
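
To illustrate what the two-static-resources variant could look like, here is a
minimal userspace sketch (not kernel code, and not the trial patch). struct
bar_range stands in for the kernel's struct resource, and the DEVFN() macro and
all p2sb_cache_* names are hypothetical, made up for this illustration; DEVFN()
uses the same slot/function encoding as the kernel's PCI_DEVFN().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same encoding as the kernel's PCI_DEVFN(): 5-bit slot, 3-bit function. */
#define DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Hypothetical userspace stand-in for the kernel's struct resource. */
struct bar_range {
	uint64_t start;
	uint64_t end;
};

/* Assumption: two slots, one for devfn=0 and one for devfn=(13,2). */
#define P2SB_CACHE_SLOTS 2

struct p2sb_cache_entry {
	bool valid;
	unsigned int devfn;
	struct bar_range bar;
};

static struct p2sb_cache_entry p2sb_cache[P2SB_CACHE_SLOTS];

/* Return the entry already holding devfn, or NULL if it is not cached. */
struct p2sb_cache_entry *p2sb_cache_find(unsigned int devfn)
{
	size_t i;

	for (i = 0; i < P2SB_CACHE_SLOTS; i++) {
		if (p2sb_cache[i].valid && p2sb_cache[i].devfn == devfn)
			return &p2sb_cache[i];
	}
	return NULL;
}

/* Cache a BAR for devfn. Returns 0 on success, -1 if no slot is free. */
int p2sb_cache_store(unsigned int devfn, const struct bar_range *bar)
{
	struct p2sb_cache_entry *e = p2sb_cache_find(devfn);
	size_t i;

	assert(bar != NULL);

	/* No existing entry for this devfn: take the first free slot. */
	for (i = 0; !e && i < P2SB_CACHE_SLOTS; i++) {
		if (!p2sb_cache[i].valid)
			e = &p2sb_cache[i];
	}
	if (!e)
		return -1;

	e->valid = true;
	e->devfn = devfn;
	e->bar = *bar;
	return 0;
}

/* Copy the cached BAR for devfn into *out. Returns 0 if found, else -1. */
int p2sb_cache_lookup(unsigned int devfn, struct bar_range *out)
{
	const struct p2sb_cache_entry *e = p2sb_cache_find(devfn);

	assert(out != NULL);

	if (!e)
		return -1;
	*out = e->bar;
	return 0;
}
```

A fixed array needs no allocation or freeing, but a store for a third devfn
fails, so the slot count would have to grow if new callers appear.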

> > > 
> > > > Having said that, Heiner says in another mail that "A solution has to support
> > > > pci drivers using p2sb_bar() in probe()". This idea does not fulfill it. Hmm.
> > > 
> > > Basically what you need to do is create two initcalls:
> > > 
> > > Add one arch_initcall to unhide the P2SB device.
> > > 
> > > The P2SB subsequently gets enumerated by the PCI core in a subsys_initcall.
> > > 
> > > Then add an fs_initcall which extracts and stashes the struct resource,
> > > hides the P2SB device and destroys the corresponding pci_dev.
> > > 
> > > Then you don't need to acquire any locks at runtime, just retrieve the
> > > stashed struct resource.
> > > 
> > > This approach will result in the P2SB device briefly being enumerated
> > > and a driver could in theory bind to it.  Andy, is this a problem?
> > > I'm not seeing any drivers in the tree which bind to 8086/c5c5.
> > 
> > At least one problem just out of my head. The P2SB on many system is PCI
> > function 0. Unhiding the P2SB unhides all functions on that device, and
> > we have use cases for those (that's why we have two first parameters to
> > p2sb_bar() in case we want non-default device to be looked at).
> 
> For the clarity this is true for ATOM_GOLDMONT (see p2sb_cpu_ids array).

Lukas, thank you for the idea. If I understand Andy's comment correctly, P2SB
should not stay unhidden between arch_initcall and fs_initcall. Hmm.

This made me think: how about unhiding and hiding P2SB only during fs_initcall
to cache the P2SB resources? To try it, I added the function below on top of the
previous trial patch. The added function calls p2sb_bar() for devfn=0 at
fs_initcall so that the resource is cached before i2c-i801 probes. This worked
well on my system. It avoided the deadlock as well as the lockdep WARN :)

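static int __init p2sb_fs_init(void)
{
	struct pci_bus *bus;
	struct resource mem;
	int ret;

	/* Assumes the P2SB device is on PCI domain 0, bus 0. */
	bus = pci_find_bus(0, 0);
	if (bus) {
		/* Called only to populate the cache for devfn=0; mem is unused. */
		ret = p2sb_bar(bus, 0, &mem);
		if (ret)
			pr_err("p2sb_bar failed: %d\n", ret);
	}
	/* A caching failure is only logged; the initcall always succeeds. */
	return 0;
}
fs_initcall(p2sb_fs_init);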

The result of the trial is encouraging, but I'm not yet sure whether this idea
is really feasible. I have three questions in mind:

- The trial function above assumes the P2SB device is on PCI domain 0, bus 0.
  That holds on my system, but is it always valid? I see it is valid at least
  for drivers/edac/pnd2_edac.c and drivers/watchdog/simatic-ipc-wdt.c, but I am
  not sure about drivers/mfd/lpc_ich.c and drivers/i2c/busses/i2c-i801.c.

- The trial function above only caches the resource for devfn=0. This is not
  enough for drivers/mfd/lpc_ich.c: another resource, for devfn=(13,2), should
  be cached too. Hardcoding these devfns and always caching them does not look
  good. It seems the drivers that call p2sb_bar() need to tell p2sb which
  devfns to cache. How can we do that?

- Does this work across suspend-resume?

Comments on these questions would be appreciated.



