Re: [PATCH v1] PCI: starfive: Fix kmemleak in StarFive PCIe driver's IRQ handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Manivannan

On Fri, 14 Feb 2025 at 11:39, Manivannan Sadhasivam
<manivannan.sadhasivam@xxxxxxxxxx> wrote:
>
> On Tue, Feb 11, 2025 at 01:09:04AM +0530, Anand Moon wrote:
> > Hi Manivannan
> >
> > On Mon, 10 Feb 2025 at 23:14, Manivannan Sadhasivam
> > <manivannan.sadhasivam@xxxxxxxxxx> wrote:
> > >
> > > On Sat, Feb 08, 2025 at 07:31:08PM +0530, Anand Moon wrote:
> > > > kmemleak reported following debug log
> > > >
> > > > $ sudo cat /sys/kernel/debug/kmemleak
> > > > unreferenced object 0xffffffd6c47c2600 (size 128):
> > > >   comm "kworker/u16:2", pid 38, jiffies 4294942263
> > > >   hex dump (first 32 bytes):
> > > >     cc 7c 5a 8d ff ff ff ff 40 b0 47 c8 d6 ff ff ff  .|Z.....@.G.....
> > > >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > > >   backtrace (crc 4f07ff07):
> > > >     __create_object+0x2a/0xfc
> > > >     kmemleak_alloc+0x38/0x98
> > > >     __kmalloc_cache_noprof+0x296/0x444
> > > >     request_threaded_irq+0x168/0x284
> > > >     devm_request_threaded_irq+0xa8/0x13c
> > > >     plda_init_interrupts+0x46e/0x858
> > > >     plda_pcie_host_init+0x356/0x468
> > > >     starfive_pcie_probe+0x2f6/0x398
> > > >     platform_probe+0x106/0x150
> > > >     really_probe+0x30e/0x746
> > > >     __driver_probe_device+0x11c/0x2c2
> > > >     driver_probe_device+0x5e/0x316
> > > >     __device_attach_driver+0x296/0x3a4
> > > >     bus_for_each_drv+0x1d0/0x260
> > > >     __device_attach+0x1fa/0x2d6
> > > >     device_initial_probe+0x14/0x28
> > > > unreferenced object 0xffffffd6c47c2900 (size 128):
> > > >   comm "kworker/u16:2", pid 38, jiffies 4294942281
> > > >
> > > > This patch addresses a kmemleak reported during StarFive PCIe driver
> > > > initialization. The leak was observed with kmemleak reporting
> > > > unreferenced objects related to IRQ handling. The backtrace pointed
> > > > to the `request_threaded_irq` and related functions within the
> > > > `plda_init_interrupts` path, indicating that some allocated memory
> > > > related to IRQs was not being properly freed.
> > > >
> > > > The issue was that while the driver requested IRQs, it wasn't
> > > > correctly handling the lifecycle of the associated resources.
> > >
> > > What resources?
> > >
> > The Microchip PCIe host driver [1] handles  PCI, SEC, DEBUG, and LOCAL
> > hardware events
> > through a dedicated framework [2]. This framework allows the core driver [3]
> > to monitor and wait for these specific events.
> >
>
> Microchip driver also has its own 'event_ops' and 'event_irq_chip', so defining
> 'request_event_irq()' callback makes sense to me.
>
> > [1]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L90-L292
> > [2]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-microchip-host.c#L374-L499
> > [3]: https://github.com/torvalds/linux/blob/master/drivers/pci/controller/plda/pcie-plda-host.c#L448-L466
> >
> > StarFive PCIe driver currently lacks the necessary `request_event_irq`
> > implementation
> > to integrate with this event-handling mechanism, which prevents the core driver
> > from properly responding to these events on StarFive platforms.
> >
> > static const struct plda_event mc_event = {
> > .  request_event_irq = mc_request_event_irq,
> >   .intx_event        = EVENT_LOCAL_PM_MSI_INT_INTX,
> >   .msi_event         = EVENT_LOCAL_PM_MSI_INT_MSI,
> > };
> >
> > This patch implements dummy `request_event_irq` for the StarFive PCIe driver,
> > Since the core [3] driver is monitoring for these events
> >
>
> This still doesn't make sense to me. Under what condition you observed the
> kmemleak? Since it points to devm_request_irq(), I can understand that the
> memory allocated for the IRQ is not freed. But when does it happen?
>
The PCIe host driver is responsible for monitoring the PLDA PCIe EVENT irq.
However, I have been unable to locate the register map necessary to
implement the
required code updates,

alarm@archriscv:/media/nvme0/mainline/linux-riscv-6.y-devel$ cat
/proc/interrupts
           CPU0       CPU1       CPU2       CPU3
 10:     101145      76126      79040      80218 RISC-V INTC   5 Edge
    riscv-timer
 12:          3          0          0          0 SiFive PLIC 111 Edge
    17030000.power-controller
 13:         20          0          0          0 SiFive PLIC  30 Edge
    1600c000.rng
 14:          0          0          0          0 SiFive PLIC   1 Edge
    ccache_ecc
 15:          0          0          0          0 SiFive PLIC   3 Edge
    ccache_ecc
 16:          0          0          0          0 SiFive PLIC   4 Edge
    ccache_ecc
 17:          0          0          0          0 SiFive PLIC   2 Edge
    ccache_ecc
 20:          0          0          0          0 SiFive PLIC  73 Edge
    dw_axi_dmac_platform
 21:       1694          0          0          0 SiFive PLIC  32 Edge      ttyS0
 22:          0          0          0          0 SiFive PLIC  35 Edge
    10030000.i2c
 23:          0          0          0          0 SiFive PLIC  75 Edge
    dw-mci
 24:          0          0          0          0 SiFive PLIC  37 Edge
    10050000.i2c
 25:        171          0          0          0 SiFive PLIC  50 Edge
    12050000.i2c
 26:          0          0          0          0 SiFive PLIC  51 Edge
    12060000.i2c
 27:      10419          0          0          0 SiFive PLIC  74 Edge
    dw-mci
 28:          6          0          0          0 SiFive PLIC  25 Edge
    13010000.spi
 29:          0          0          0          0 SiFive PLIC  38 Edge      pl022
 31:          0          0          0          0 PLDA PCIe EVENT  16
Edge      940000000.pcie
 32:          0          0          0          0 PLDA PCIe EVENT  17
Edge      940000000.pcie
 33:          0          0          0          0 PLDA PCIe EVENT  18
Edge      940000000.pcie
 34:          0          0          0          0 PLDA PCIe EVENT  20
Edge      940000000.pcie
 35:          0          0          0          0 PLDA PCIe EVENT  21
Edge      940000000.pcie
 36:          0          0          0          0 PLDA PCIe EVENT  22
Edge      940000000.pcie
 39:          0          0          0          0 PLDA PCIe EVENT  26
Edge      940000000.pcie
 40:          0          0          0          0 PLDA PCIe EVENT  27
Edge      940000000.pcie
 41:          0          0          0          0 17020000.pinctrl  41
Edge      16020000.mmc cd
 42:          0          0          0          0 PLDA PCIe EVENT  28
Edge      940000000.pcie
 44:          0          0          0          0 PLDA PCIe MSI   0
Edge      PCIe PME, aerdrv, PCIe bwctrl
 46:          0          0          0          0 PLDA PCIe EVENT  16
Edge      9c0000000.pcie
 47:          0          0          0          0 PLDA PCIe EVENT  17
Edge      9c0000000.pcie
 48:          0          0          0          0 PLDA PCIe EVENT  18
Edge      9c0000000.pcie
 49:          0          0          0          0 PLDA PCIe EVENT  20
Edge      9c0000000.pcie
 50:          0          0          0          0 PLDA PCIe EVENT  21
Edge      9c0000000.pcie
 51:          0          0          0          0 PLDA PCIe EVENT  22
Edge      9c0000000.pcie
 54:          0          0          0          0 PLDA PCIe EVENT  26
Edge      9c0000000.pcie
 55:          0          0          0          0 PLDA PCIe EVENT  27
Edge      9c0000000.pcie
 56:          0          0          0          0 PLDA PCIe EVENT  28
Edge      9c0000000.pcie
 58:          0          0          0          0 PLDA PCIe MSI
134217728 Edge      PCIe PME, aerdrv, PCIe bwctrl

Yep, Thanks for your review comment.

The following changes fix the kernel memory leak warning at my end.

$ git diff
diff --git a/drivers/pci/controller/plda/pcie-plda-host.c
b/drivers/pci/controller/plda/pcie-plda-host.c
index 4153214ca410..d4e7c1b49607 100644
--- a/drivers/pci/controller/plda/pcie-plda-host.c
+++ b/drivers/pci/controller/plda/pcie-plda-host.c
@@ -461,6 +461,7 @@ int plda_init_interrupts(struct platform_device *pdev,

                if (ret) {
                        dev_err(dev, "failed to request IRQ %d\n", event_irq);
+                       irq_dispose_mapping(event_irq);
                        return ret;
                }
        }
> > > > This patch introduces an event IRQ handler and the necessary
> > > > infrastructure to manage these IRQs, preventing the memory leak.
> > > >
> > >
> > > These handles appear pointless to me. What purpose are they serving?
> > >
> > Yes, you are correct, the core event monitoring framework [3] triggered a
> > kernel memory leak. This patch adds a dummy IRQ callback as a
> > placeholder for proper event handling, which can be implemented in a
> > future patch.
> >
>
> The dummy request_event_irq() callback is not supposed to be needed in the first
> place. So clearly, this patch is not fixing the actual memory leak but trying to
> cover it up.
You are correct.
>
> - Mani
>
Thanks
-Anand
> --
> மணிவண்ணன் சதாசிவம்





[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux