Re: [PATCH] pci/aspm: Fix null pointer dereference during re-enumeration of PCI bus in pcie_aspm_init_link_state even though ASPM is off

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 03, 2017 at 09:04:36AM +0100, Jan Rüth wrote:
> This patch fixes a null pointer dereference during PCI bus enumeration
> when ASPM is off. On an IBM x3850 8664 this bug causes the kernel to
> halt, this behavior did not appear in 3.10, so this is a regression.
> 
> pcie_aspm_sanity_check should only be called if ASPM is on.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=187731

> Signed-off-by: Jan Rueth <rueth@xxxxxxxxxxxxxxxxxxxxx>
> ---
>  drivers/pci/pcie/aspm.c | +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index f981129..e758b56 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -552,11 +552,12 @@ static struct pcie_link_state
> *alloc_pcie_link_state(struct pci_dev *pdev)
>  void pcie_aspm_init_link_state(struct pci_dev *pdev)
>  {
>         struct pcie_link_state *link;
> -       int blacklist = !!pcie_aspm_sanity_check(pdev);
> -
> +       int blacklist;
>         if (!aspm_support_enabled)
>                 return;
> 
> +       blacklist = !!pcie_aspm_sanity_check(pdev);
> +
>         if (pdev->link_state)
>                 return;

For some reason this doesn't apply for me (complains about a malformed
patch), but I can apply it by hand easily enough.

However, this path is a little obtuse, and I'd like to understand
what's going on better.  We currently have:

  pci_scan_slot
    if (bus->self && nr)
      pcie_aspm_init_link_state(bus->self)
        pcie_aspm_sanity_check
	  list_for_each_entry(child, &pdev->subordinate->devices, ...)

I assume the null pointer is pdev->subordinate, so maybe we're calling
pcie_aspm_init_link_state() for a bridge where we weren't able to
create a child bus ("pdev->subordinate")?

I think it's legal to have a bridge device where we haven't set
pdev->subordinate, but it does seem unusual.

I don't see a log with the actual null pointer dereference in the
bugzilla.  Do you have any more details about which device blows up
here and why it's unusual?

Part of the reason I'm curious is that I'd like to identify the commit
that introduced the problem so we can figure out where the fix might
need to be backported.  I suspected b7206cbf024d ("PCI ASPM: support
partial aspm enablement") because it moved the
pcie_aspm_sanity_check() call before the checks for aspm_disabled, but
that's been around since 2.6 days, and the problem you're seeing
didn't happen until after 3.10.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux