Hi, Reinette,

> On driver unload any pending descriptors are flushed and pending DMA
> descriptors are explicitly completed:
> idxd_dmaengine_drv_remove() ->
> drv_disable_wq() ->
> idxd_wq_free_irq() ->
> idxd_flush_pending_descs() ->
> idxd_dma_complete_txd()
>
> With this done during driver unload any remaining descriptor is likely stuck and
> can be dropped. Even so, the descriptor may still have a callback set that could
> no longer be accessible. An example of such a problem is when the dmatest fails
> and the dmatest module is unloaded. The failure of dmatest leaves descriptors
> with dma_async_tx_descriptor::callback pointing to code that no longer exist.
> This causes a page fault as below at the time the IDXD driver is unloaded when it
> attempts to run the callback:
> BUG: unable to handle page fault for address: ffffffffc0665190
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
>
> Fix this by clearing the callback pointers on the transmit descriptors only when
> workqueue is disabled.
>
> Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> ---
>
> History of refactoring made the Fixes: hard to identify by me.
>
> drivers/dma/idxd/device.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c index
> b4d7bb923a40..2ac71a34fa34 100644
> --- a/drivers/dma/idxd/device.c
> +++ b/drivers/dma/idxd/device.c
> @@ -1156,6 +1156,7 @@ int idxd_device_load_config(struct idxd_device *idxd)
>
> static void idxd_flush_pending_descs(struct idxd_irq_entry *ie) {
> +	struct dma_async_tx_descriptor *tx;

Nitpicking. It's better to move this line to below:

> 	struct idxd_desc *desc, *itr;
> 	struct llist_node *head;
> 	LIST_HEAD(flist);
> @@ -1175,6 +1176,15 @@ static void idxd_flush_pending_descs(struct
> idxd_irq_entry *ie)
> 		list_for_each_entry_safe(desc, itr, &flist, list) {

here?
+			struct dma_async_tx_descriptor *tx;

> 			list_del(&desc->list);
> 			ctype = desc->completion->status ?
> 				IDXD_COMPLETE_NORMAL : IDXD_COMPLETE_ABORT;
> +			/*
> +			 * wq is being disabled. Any remaining descriptors are
> +			 * likely to be stuck and can be dropped. callback could
> +			 * point to code that is no longer accessible, for example
> +			 * if dmatest module has been unloaded.
> +			 */
> +			tx = &desc->txd;
> +			tx->callback = NULL;
> +			tx->callback_result = NULL;
> 			idxd_dma_complete_txd(desc, ctype, true);
> 		}
> 	}
> --
> 2.34.1

Reviewed-by: Fenghua Yu <fenghua.yu@xxxxxxxxx>

Thanks.

-Fenghua
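
For illustration, the effect of clearing the callbacks before completion can be sketched in plain userspace C. This is only a model under stated assumptions: the demo_* types and functions are hypothetical stand-ins, not the idxd or dmaengine code, and the completion helper is assumed to invoke any callback that is still set. The descriptor pointer is declared inside the loop, as suggested in the review comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; stands in for the dmaengine callback slots. */
typedef void (*demo_cb_t)(void *param);

/* Hypothetical stand-in for dma_async_tx_descriptor (callback fields only). */
struct demo_txd {
	demo_cb_t callback;
	demo_cb_t callback_result;
	void *callback_param;
};

/* Hypothetical stand-in for a driver descriptor embedding a txd. */
struct demo_desc {
	struct demo_txd txd;
	int completed;
};

/* Assumed completion behaviour: run whichever callback is still set. */
static void demo_complete_txd(struct demo_desc *desc)
{
	struct demo_txd *tx = &desc->txd;

	if (tx->callback)
		tx->callback(tx->callback_param);
	if (tx->callback_result)
		tx->callback_result(tx->callback_param);
	desc->completed = 1;
}

/*
 * Flush path: the callbacks may point at code that is gone, so clear
 * them before completing. Returns the number of descriptors flushed.
 */
static size_t demo_flush_pending(struct demo_desc *descs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Declared in loop scope, next to its only use. */
		struct demo_txd *tx = &descs[i].txd;

		tx->callback = NULL;
		tx->callback_result = NULL;
		demo_complete_txd(&descs[i]);
	}
	return n;
}

/* Test helper: counts invocations through the param pointer. */
static void demo_count_cb(void *param)
{
	(*(int *)param)++;
}
```

Calling demo_flush_pending() on descriptors that still carry a callback completes them without invoking it, which is the property the patch relies on when the module that owned the callback may already be unloaded.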