On Wed, Feb 26, 2020 at 01:59:33PM +0000, Jonathan Cameron wrote: > > +static int iopf_complete(struct device *dev, struct iopf_fault *iopf, > > + enum iommu_page_response_code status) > > This is called once per group. Should name reflect that? Ok [...] > > +/** > > + * iommu_queue_iopf - IO Page Fault handler > > + * @evt: fault event > > + * @cookie: struct device, passed to iommu_register_device_fault_handler. > > + * > > + * Add a fault to the device workqueue, to be handled by mm. > > + * > > + * Return: 0 on success and <0 on error. > > + */ > > +int iommu_queue_iopf(struct iommu_fault *fault, void *cookie) > > +{ > > + int ret; > > + struct iopf_group *group; > > + struct iopf_fault *iopf, *next; > > + struct iopf_device_param *iopf_param; > > + > > + struct device *dev = cookie; > > + struct iommu_param *param = dev->iommu_param; > > + > > + if (WARN_ON(!mutex_is_locked(¶m->lock))) > > + return -EINVAL; > > Just curious... > > Why do we always need a runtime check on this rather than say, > using lockdep_assert_held or similar? I probably didn't know about lockdep_assert at the time :) > > + /* > > + * It is incredibly easy to find ourselves in a deadlock situation if > > + * we're not careful, because we're taking the opposite path as > > + * iommu_queue_iopf: > > + * > > + * iopf_queue_flush_dev() | PRI queue handler > > + * lock(¶m->lock) | iommu_queue_iopf() > > + * queue->flush() | lock(¶m->lock) > > + * wait PRI queue empty | > > + * > > + * So we can't hold the device param lock while flushing. Take a > > + * reference to the device param instead, to prevent the queue from > > + * going away. > > + */ > > + mutex_lock(¶m->lock); > > + iopf_param = param->iopf_param; > > + if (iopf_param) { > > + queue = param->iopf_param->queue; > > + iopf_param->busy = true; > > Describing this as taking a reference is not great... > I'd change the comment to set a flag or something like that. > > Is there any potential of multiple copies of this running against > each other? I've not totally gotten my head around when this > might be called yet. Yes it's allowed, this should be a refcount [...] > > +int iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev) > > +{ > > + int ret = -EINVAL; > > + struct iopf_fault *iopf, *next; > > + struct iopf_device_param *iopf_param; > > + struct iommu_param *param = dev->iommu_param; > > + > > + if (!param || !queue) > > + return -EINVAL; > > + > > + do { > > + mutex_lock(&queue->lock); > > + mutex_lock(¶m->lock); > > + iopf_param = param->iopf_param; > > + if (iopf_param && iopf_param->queue == queue) { > > + if (iopf_param->busy) { > > + ret = -EBUSY; > > + } else { > > + list_del(&iopf_param->queue_list); > > + param->iopf_param = NULL; > > + ret = 0; > > + } > > + } > > + mutex_unlock(¶m->lock); > > + mutex_unlock(&queue->lock); > > + > > + /* > > + * If there is an ongoing flush, wait for it to complete and > > + * then retry. iopf_param isn't going away since we're the only > > + * thread that can free it. > > + */ > > + if (ret == -EBUSY) > > + wait_event(iopf_param->wq_head, !iopf_param->busy); > > + else if (ret) > > + return ret; > > + } while (ret == -EBUSY); > > I'm in two minds about the next comment (so up to you)... > > Currently this looks a bit odd. Would you be better off just having a separate > parameter for busy and explicit separate handling for the error path? > > bool busy; > int ret = 0; > > do { > mutex_lock(&queue->lock); > mutex_lock(¶m->lock); > iopf_param = param->iopf_param; > if (iopf_param && iopf_param->queue == queue) { > busy = iopf_param->busy; > if (!busy) { > list_del(&iopf_param->queue_list); > param->iopf_param = NULL; > } > } else { > ret = -EINVAL; > } > mutex_unlock(¶m->lock); > mutex_unlock(&queue->lock); > if (ret) > return ret; > if (busy) > wait_event(iopf_param->wq_head, !iopf_param->busy); > > } while (busy); > > .. Sure, I think it looks better Thanks, Jean