On Thu, Mar 04, 2021 at 09:42:44AM -0500, Prarit Bhargava wrote:
> 
> 
> On 2/18/21 2:06 PM, Bjorn Helgaas wrote:
> > On Thu, Feb 18, 2021 at 01:36:35PM -0500, Prarit Bhargava wrote:
> >> On 1/26/21 10:12 AM, Bjorn Helgaas wrote:
> >>> On Tue, Jan 26, 2021 at 09:05:23AM -0500, Prarit Bhargava wrote:
> >>>> On 1/26/21 8:53 AM, Leon Romanovsky wrote:
> >>>>> On Tue, Jan 26, 2021 at 08:42:12AM -0500, Prarit Bhargava wrote:
> >>>>>> On 1/26/21 8:14 AM, Leon Romanovsky wrote:
> >>>>>>> On Tue, Jan 26, 2021 at 07:54:46AM -0500, Prarit Bhargava wrote:
> >>>>>>>> Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> >>>>>>>>> On Mon, Jan 25, 2021 at 02:41:38PM -0500, Prarit Bhargava wrote:
> >>>>>>>>>> There are two situations where driver load messages are helpful.
> >>>>>>>>>>
> >>>>>>>>>> 1) Some drivers silently load on devices, and debugging driver or
> >>>>>>>>>> system failures in these cases is difficult. While some drivers
> >>>>>>>>>> (networking, for example) may not be completely initialized by the
> >>>>>>>>>> time the PCI driver's probe() function returns, it is still useful
> >>>>>>>>>> to have some idea of driver completion.
> >>>>>>>>>
> >>>>>>>>> Sorry, probably it is me, but I don't understand this use case.
> >>>>>>>>> Are you adding a global kernel command-line boot argument to debug
> >>>>>>>>> what, and when?
> >>>>>>>>>
> >>>>>>>>> During boot:
> >>>>>>>>> If the device succeeds, you will see it in
> >>>>>>>>> /sys/bus/pci/[drivers|devices]/*.
> >>>>>>>>> If the device fails, you should get an error from that device (fix
> >>>>>>>>> the device to return an error), or something immediately won't work
> >>>>>>>>> and you won't see it in sysfs.
> >>>>>>>>
> >>>>>>>> What if there is a panic during boot? There's no way to get to
> >>>>>>>> sysfs. That's the case where this is helpful.
> >>>>>>>
> >>>>>>> How? If you have a kernel panic, you have a much worse problem than
> >>>>>>> an unsupported device. If the kernel panic was caused by the driver,
> >>>>>>> you will see a call trace related to it. If the kernel panic was
> >>>>>>> caused by something else, supported/not-supported won't help here.
> >>>>>>
> >>>>>> I still have no idea *WHICH* device the panic occurred on.
> >>>>>
> >>>>> The kernel panic is printed from the driver. There is one driver
> >>>>> loaded for all identical PCI devices, which are probed without
> >>>>> relation to their number.
> >>>>>
> >>>>> If you have a host with ten of the same card, you will see one
> >>>>> driver, and that is where the problem is, not in a
> >>>>> supported/not-supported device.
> >>>>
> >>>> That's true, but you can also have different cards loading the same
> >>>> driver. See, for example, any PCI_IDs list in a driver.
> >>>>
> >>>> For example,
> >>>>
> >>>> 10:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
> >>>> 20:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] (rev 02)
> >>>>
> >>>> Both load the megaraid driver and have different profiles within the
> >>>> driver. I have no idea which one actually panicked until I remove
> >>>> one card.
> >>>>
> >>>> It's MUCH worse when debugging new hardware and getting a panic
> >>>> from, for example, the uncore code, which binds to a PCI-mapped
> >>>> device. One device might work and the next one doesn't. And then
> >>>> you can multiply that by seeing *many* panics at once and trying
> >>>> to determine if the problem was on one specific socket, die, or
> >>>> core.
> >>>
> >>> Would a dev_panic() interface that identified the device and
> >>> driver help with this?
> >>
> >> ^^ The more I look at this problem, the more a dev_panic() that
> >> would output a device-specific message at panic time is what I
> >> really need.
> 
> I went down this road a bit and had a realization. The issue isn't
> with printing something at panic time, but with the *data* that is
> output. Each PCI device is associated with a struct device, and that
> device's name is what dev_dbg() and friends print. The PCI subsystem
> sets the device name at drivers/pci/probe.c:1799:
> 
>   dev_set_name(&dev->dev, "%04x:%02x:%02x.%d", pci_domain_nr(dev->bus),
>                dev->bus->number, PCI_SLOT(dev->devfn),
>                PCI_FUNC(dev->devfn));
> 
> My problem really is that the above information is insufficient when
> I (or a user) need to debug a system. Debugging multiple broken
> driver loads would be much easier if I didn't have to constantly add
> this output manually :).

This *should* already be in the dmesg log:

  pci 0000:00:00.0: [8086:5910] type 00 class 0x060000
  pci 0000:00:01.0: [8086:1901] type 01 class 0x060400
  pci 0000:00:02.0: [8086:591b] type 00 class 0x030000

So if you had a dev_panic(), that message would include the
bus/device/function number, and that would be enough to find the
vendor/device ID from when the device was first enumerated.

Or are you saying you can't get the part of the dmesg log that
contains those vendor/device IDs?

> Would you be okay with adding a *debug* parameter to expand the
> device name to include the vendor & device ID pair? FWIW, I'm
> somewhat against yet-another-kernel-option, but that's really the
> information I need. I could then add dev_dbg() statements in the
> local_pci_probe() function.
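For the probe case, you might not need a new kernel parameter or a
name change at all; dynamic debug can carry the vendor/device pair.
Something like the following (an untested sketch, and
probe_with_trace() is a made-up wrapper name, not existing code in
drivers/pci/pci-driver.c) shows the idea:

  /*
   * Untested sketch: bracket the driver probe call with dev_dbg()
   * so each attempt is tied to a driver name and vendor:device pair.
   * pdev->vendor and pdev->device are cached at enumeration time,
   * so no config space reads are needed here.
   */
  static int probe_with_trace(struct pci_driver *drv, struct pci_dev *pdev,
                              const struct pci_device_id *id)
  {
          int rc;

          dev_dbg(&pdev->dev, "%s: probing [%04x:%04x]\n",
                  drv->name, pdev->vendor, pdev->device);
          rc = drv->probe(pdev, id);
          dev_dbg(&pdev->dev, "%s: probe of [%04x:%04x] returned %d\n",
                  drv->name, pdev->vendor, pdev->device, rc);
          return rc;
  }

That could be enabled at boot with the existing dynamic debug
machinery, e.g. dyndbg="func probe_with_trace +p" on the command
line, instead of adding yet another kernel option.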
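And for the panic-time case, dev_panic() doesn't exist today, but
modeled on dev_WARN() it could be as small as this (again, only a
sketch, not a tested patch):

  /*
   * Hypothetical helper, modeled on dev_WARN(): prefix the panic
   * message with the driver and device names so the dying gasp
   * identifies which device we were handling.
   */
  #define dev_panic(dev, fmt, ...)                                      \
          panic("%s %s: " fmt,                                          \
                (dev)->driver ? (dev)->driver->name : "(no driver)",    \
                dev_name(dev), ##__VA_ARGS__)

The dev_name() there carries the domain:bus:dev.fn string, which is
enough to match against the "pci 0000:00:02.0: [8086:591b] ..."
enumeration lines above.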