On Fri, Jan 10, 2025 at 03:49:50PM -0400, Jason Gunthorpe wrote:
> On Fri, Jan 10, 2025 at 11:27:53AM -0800, Nicolin Chen wrote:
> > On Fri, Jan 10, 2025 at 01:48:42PM -0400, Jason Gunthorpe wrote:
> > > On Tue, Jan 07, 2025 at 09:10:09AM -0800, Nicolin Chen wrote:
> > >
> > > > +static ssize_t iommufd_veventq_fops_read(struct iommufd_eventq *eventq,
> > > > +					 char __user *buf, size_t count,
> > > > +					 loff_t *ppos)
> > > > +{
> > > > +	size_t done = 0;
> > > > +	int rc = 0;
> > > > +
> > > > +	if (*ppos)
> > > > +		return -ESPIPE;
> > > > +
> > > > +	mutex_lock(&eventq->mutex);
> > > > +	while (!list_empty(&eventq->deliver) && count > done) {
> > > > +		struct iommufd_vevent *cur = list_first_entry(
> > > > +			&eventq->deliver, struct iommufd_vevent, node);
> > > > +
> > > > +		if (cur->data_len > count - done)
> > > > +			break;
> > > > +
> > > > +		if (copy_to_user(buf + done, cur->event_data, cur->data_len)) {
> > > > +			rc = -EFAULT;
> > > > +			break;
> > > > +		}
> > >
> > > Now that I look at this more closely, the fault path this is copied
> > > from is not great.
> > >
> > > This copy_to_user() can block while waiting on a page fault, possibly
> > > for a long time. While blocked the mutex is held and we can't add more
> > > entries to the list.
> > >
> > > That will cause the shared IRQ handler in the iommu driver to back up,
> > > which would cause a global DOS.
> > >
> > > This probably wants to be organized to look more like
> > >
> > > while (itm = eventq_get_next_item(eventq)) {
> > >    if (..) {
> > >        eventq_restore_failed_item(eventq);
> > >        return -1;
> > >    }
> > > }
> > >
> > > Where the next_item would just be a simple spinlock across the linked
> > > list manipulation.
> >
> > Would it be simpler by just limiting one node per read(), i.e.
> > no "while (!list_empty)" and no block?
> >
> > The report() adds one node at a time, and wakes up the poll()
> > each time of adding a node. And user space could read one event
> > at a time too?
>
> That doesn't really help, the issue is it holds the lock over the
> copy_to_user() which it is doing because it doesn't want to pull the item
> off the list and then try to handle the failure and put it back.

Hmm, it seems that I haven't got your first narrative straight..

Would you mind elaborating on "copy_to_user() can block while waiting
on a page fault"? When would this happen?

Thanks
Nicolin
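
For illustration only, here is a user-space sketch of the loop shape
suggested in the quoted text above: detach one item under a short lock,
do the possibly slow copy with no lock held, and put the item back if
the copy fails. It is not the iommufd code. The struct layouts,
slow_copy() (a stand-in for copy_to_user()) and the `fail` parameter are
hypothetical, a pthread mutex stands in for the spinlock, and only the
two helper names come from the quoted pseudocode; their bodies are
assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the structures in the quoted patch. */
struct vevent {
	struct vevent *next;
	size_t data_len;
	char data[16];
};

struct eventq {
	pthread_mutex_t lock;	/* plays the role of the short spinlock */
	struct vevent *head;	/* oldest undelivered event */
};

/* Detach the oldest event; the lock covers only the list manipulation. */
static struct vevent *eventq_get_next_item(struct eventq *q)
{
	struct vevent *ev;

	pthread_mutex_lock(&q->lock);
	ev = q->head;
	if (ev)
		q->head = ev->next;
	pthread_mutex_unlock(&q->lock);
	return ev;
}

/* Put an event back at the head so that ordering is preserved. */
static void eventq_restore_failed_item(struct eventq *q, struct vevent *ev)
{
	pthread_mutex_lock(&q->lock);
	ev->next = q->head;
	q->head = ev;
	pthread_mutex_unlock(&q->lock);
}

/* Stand-in for copy_to_user(): nonzero on failure, possibly slow. */
static int slow_copy(char *dst, const char *src, size_t len, int fail)
{
	if (fail)
		return 1;
	memcpy(dst, src, len);
	return 0;
}

/*
 * Returns the number of bytes copied, or -1 if the very first copy fails.
 * Events are owned by the caller in this sketch, so nothing is freed.
 */
static long eventq_read(struct eventq *q, char *buf, size_t count, int fail)
{
	size_t done = 0;
	struct vevent *ev;

	while ((ev = eventq_get_next_item(q)) != NULL) {
		/* No lock is held here, so producers can keep queueing. */
		if (ev->data_len > count - done) {
			eventq_restore_failed_item(q, ev);	/* does not fit */
			break;
		}
		if (slow_copy(buf + done, ev->data, ev->data_len, fail)) {
			eventq_restore_failed_item(q, ev);
			return done ? (long)done : -1;
		}
		done += ev->data_len;
		assert(done <= count);
	}
	return (long)done;
}
```

Restoring at the head keeps event order only with a single reader at a
time; concurrent readers would need extra serialization, which this
sketch does not attempt.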