On Wed, Aug 24, 2022 at 01:29:03PM -0700, Dave Jiang wrote: > > On 8/24/2022 12:29 PM, Jerry Snitselaar wrote: > > For a software reset idxd_device_reinit() is called, which will walk > > the device workqueues to see which ones were enabled, and try to > > re-enable them. It keys off wq->state being iDXD_WQ_ENABLED, but the > > first thing idxd_enable_wq() will do is see that the state of the > > workqueue is enabled, and return 0 instead of attempting to issue > > a command to enable the workqueue. > > > > So once a workqueue is found that needs to be re-enabled, > > set the state to disabled prior to calling idxd_enable_wq(). > > This would accurately reflect the state if the enable fails > > as well. > > > > Cc: Fenghua Yu <fenghua.yu@xxxxxxxxx> > > Cc: Dave Jiang <dave.jiang@xxxxxxxxx> > > Cc: Vinod Koul <vkoul@xxxxxxxxxx> > > Cc: dmaengine@xxxxxxxxxxxxxxx > > Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") > > Signed-off-by: Jerry Snitselaar <jsnitsel@xxxxxxxxxx> > > --- > > drivers/dma/idxd/irq.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/dma/idxd/irq.c b/drivers/dma/idxd/irq.c > > index 743ead5ebc57..723eeb5328d6 100644 > > --- a/drivers/dma/idxd/irq.c > > +++ b/drivers/dma/idxd/irq.c > > @@ -52,6 +52,7 @@ static void idxd_device_reinit(struct work_struct *work) > > struct idxd_wq *wq = idxd->wqs[i]; > > if (wq->state == IDXD_WQ_ENABLED) { > > + wq->state = IDXD_WQ_DISABLED; > Might be better off to insert this line in idxd_wq_disable_cleanup(). I > think that should put it in sane state. I don't think that is called in the code path that I was lookng at. I've been looking at this bit of process_misc_interrupts(): halt: gensts.bits = ioread32(idxd->reg_base + IDXD_GENSTATS_OFFSET); if (gensts.state == IDXD_DEVICE_STATE_HALT) { idxd->state = IDXD_DEV_HALTED; if (gensts.reset_type == IDXD_DEVICE_RESET_SOFTWARE) { /* * If we need a software reset, we will throw the work * on a system workqueue in order to allow interrupts * for the device command completions. */ INIT_WORK(&idxd->work, idxd_device_reinit); queue_work(idxd->wq, &idxd->work); } else { idxd->state = IDXD_DEV_HALTED; idxd_wqs_quiesce(idxd); idxd_wqs_unmap_portal(idxd); spin_lock(&idxd->dev_lock); idxd_device_clear_state(idxd); dev_err(&idxd->pdev->dev, "idxd halted, need %s.\n", gensts.reset_type == IDXD_DEVICE_RESET_FLR ? "FLR" : "system reset"); spin_unlock(&idxd->dev_lock); return -ENXIO; } } return 0; } So it sees that the device is halted, and sticks idxd_device_reinint() on that workqueue. The idxd_device_reinit() has this loop to re-enable the idxd wqs: for (i = 0; i < idxd->max_wqs; i++) { struct idxd_wq *wq = idxd->wqs[i]; if (wq->state == IDXD_WQ_ENABLED) { wq->state = IDXD_WQ_DISABLED; rc = idxd_wq_enable(wq); if (rc < 0) { dev_warn(dev, "Unable to re-enable wq %s\n", dev_name(wq_confdev(wq))); } } } Once you go into idxd_wq_enable() though you get this check at the beginning: if (wq->state == IDXD_WQ_ENABLED) { dev_dbg(dev, "WQ %d already enabled\n", wq->id); return 0; } So IIUC it sees the device is halted, goes to reset it, figures out a wq should be re-enabled, calls idxd_wq_enable() which hits the check, returns 0 and the wq is never really re-enabled, though it will still have wq state set to IDXD_WQ_ENABLED. Or am I missing something? Regards, Jerry > > rc = idxd_wq_enable(wq); > > if (rc < 0) { > > dev_warn(dev, "Unable to re-enable wq %s\n",