Re: [PATCH] pci: fix I/O space page leak

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello!

On 6/28/2018 5:26 PM, Lorenzo Pieralisi wrote:

When testing the R-Car PCIe driver on the Condor board, I noticed that iff
I  left the PCIe PHY driver disabled, the kernel crashed  with this BUG:

[    1.225819] kernel BUG at lib/ioremap.c:72!
[    1.230007] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[    1.235496] Modules linked in:
[    1.238561] CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
[    1.245526] Hardware name: Renesas Condor board based on r8a77980 (DT)
[    1.252075] Workqueue: events deferred_probe_work_func
[    1.257220] pstate: 80000005 (Nzcv daif -PAN -UAO)
[    1.262024] pc : ioremap_page_range+0x370/0x3c8
[    1.266558] lr : ioremap_page_range+0x40/0x3c8
[    1.271002] sp : ffff000008da39e0
[    1.274317] x29: ffff000008da39e0 x28: 00e8000000000f07
[    1.279636] x27: ffff7dfffee00000 x26: 0140000000000000
[    1.284954] x25: ffff7dfffef00000 x24: 00000000000fe100
[    1.290272] x23: ffff80007b906000 x22: ffff000008ab8000
[    1.295590] x21: ffff000008bb1d58 x20: ffff7dfffef00000
[    1.300909] x19: ffff800009c30fb8 x18: 0000000000000001
[    1.306226] x17: 00000000000152d0 x16: 00000000014012d0
[    1.311544] x15: 0000000000000000 x14: 0720072007200720
[    1.316862] x13: 0720072007200720 x12: 0720072007200720
[    1.322180] x11: 0720072007300730 x10: 00000000000000ae
[    1.327498] x9 : 0000000000000000 x8 : ffff7dffff000000
[    1.332816] x7 : 0000000000000000 x6 : 0000000000000100
[    1.338134] x5 : 0000000000000000 x4 : 000000007b906000
[    1.343452] x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
[    1.348770] x1 : 0000000040000000 x0 : 00e80000fe100f07
[    1.354090] Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
[    1.361056] Call trace:
[    1.363504]  ioremap_page_range+0x370/0x3c8
[    1.367695]  pci_remap_iospace+0x7c/0xac
[    1.371624]  pci_parse_request_of_pci_ranges+0x13c/0x190
[    1.376945]  rcar_pcie_probe+0x4c/0xb04
[    1.380786]  platform_drv_probe+0x50/0xbc
[    1.384799]  driver_probe_device+0x21c/0x308
[    1.389072]  __device_attach_driver+0x98/0xc8
[    1.393431]  bus_for_each_drv+0x54/0x94
[    1.397269]  __device_attach+0xc4/0x12c
[    1.401107]  device_initial_probe+0x10/0x18
[    1.405292]  bus_probe_device+0x90/0x98
[    1.409130]  deferred_probe_work_func+0xb0/0x150
[    1.413756]  process_one_work+0x12c/0x29c
[    1.417768]  worker_thread+0x200/0x3fc
[    1.421522]  kthread+0x108/0x134
[    1.424755]  ret_from_fork+0x10/0x18
[    1.428334] Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)

It turned out that pci_remap_iospace() wasn't undone when the driver's
probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
the probe was retried,  finally causing the BUG due to trying to remap
already remapped pages.

The most feasible solution seems to introduce devm_pci_remap_iospace()
and call it instead of pci_remap_iospace(), so that the pages get unmapped
automagically on any probe failure.

And  while fixing pci_parse_request_of_pci_ranges(), aslo fix the other
drivers that have probably copied the bad example...

Fixes: 4e64dbe226e7 ("PCI: generic: Expose pci_host_common_probe() for use by other drivers")
Fixes: cbce7900598c ("PCI: designware: Make driver arch-agnostic")
Fixes: 8c39d710363c ("PCI: aardvark: Add Aardvark PCI host controller driver")
Fixes: d3c68e0a7e34 ("PCI: faraday: Add Faraday Technology FTPCI100 PCI Host Bridge driver")
Fixes: 68a15eb7bd0c ("PCI: v3-semi: Add V3 Semiconductor PCI host driver")
Fixes: b7e78170efd4 ("PCI: versatile: Add DT-based ARM Versatile PB PCIe host driver")
Fixes: 5f6b6ccdbe1c ("PCI: xgene: Add APM X-Gene PCIe driver")
Fixes: 637cfacae96f ("PCI: mediatek: Add MediaTek PCIe host controller support")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@xxxxxxxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx

Let me know if you want me to take this, Lorenzo, otherwise:
s/pci: fix/PCI: Fix/ and

Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>

Thank you Bjorn, yes it could go in as a fix but IMO it has to be split,
more so given the stable tag (and I think that each "Fixes" tag should
be self-contained),

It cannot be self-contained because it'll depend on the initial commit adding devm_pci_remap_iobase(). If you mean finding the earliest broken driver and introduce the deviec managed API while fixing it and then make use of that
API in the subsequent patches, that surely can be done.

merging it as-is would give Greg (and us) a
headache when it comes to backporting it.

The patch interdependency would give him headache too, and I was hoping to relieve those with the monilitic patch. :-)

Honestly I think it is best to split it up and send it for v4.19 but
I am happy to hear other options.

I disagree about 4.19. The R-Car PCIe situation is as follows: given me missing to get the PHY driver merged into 4.18 (and the gen3 PCIe stuff successfully merged into 4.18), the user is bound to have PCIe not working (if he doesn't refer to the PHY driver in DT) or encounter a kernel BUG (if he does refer to the PHY driver), thus I'd like this BUG to be fixed in 4.18 time frame... Note that any PCI driver having signaled a probe deferral will also hit this BUG., not just R-Car one.

Lorenzo

MBR, Sergei



[Index of Archives]     [Linux Samsung SOC]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux