On Tue, Nov 21, 2023 at 08:23:16PM -0800, Daniel Stodden wrote: > A PCI device hot removal may occur while stdev->cdev is held open. The call > to stdev_release() then happens during close or exit, at a point way past > switchtec_pci_remove(). Otherwise the last ref would vanish with the > trailing put_device(), just before return. > > At that later point in time, the devm cleanup has already removed the > stdev->mmio_mrpc mapping. Also, the stdev->pdev reference was not a counted > one. Therefore, in DMA mode, the iowrite32() in stdev_release() will cause > a fatal page fault, and the subsequent dma_free_coherent(), if reached, > would pass a stale &stdev->pdev->dev pointer. > > Fix by moving MRPC DMA shutdown into switchtec_pci_remove(), after > stdev_kill(). Counting the stdev->pdev ref is now optional, but may prevent > future accidents. > > Reproducible via the script at > https://lore.kernel.org/r/20231113212150.96410-1-dns@xxxxxxxxxx > > Link: https://lore.kernel.org/r/20231113212150.96410-2-dns@xxxxxxxxxx > Signed-off-by: Daniel Stodden <dns@xxxxxxxxxx> > Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > Reviewed-by: Logan Gunthorpe <logang@xxxxxxxxxxxx> > Reviewed-by: Dmitry Safonov <dima@xxxxxxxxxx> Oh, I forgot to mention: for future reference, you should only add Signed-off-by when you create the patch or you receive it from somebody else and are passing it on. You should not add the Signed-off-by of the person you're sending it *to*, because that person will add their own Signed-off-by when they process it. E.g., I apply patches with "git am --signoff" which adds my Signed-off-by, which would result in a duplicate. No worries, I took care of it so there's no duplicate for me :) Bjorn