On Tue, Jan 03, 2023 at 01:12:53PM -0500, Steven Sistare wrote:
> On 1/3/2023 10:20 AM, Jason Gunthorpe wrote:
> > On Tue, Dec 20, 2022 at 12:39:20PM -0800, Steve Sistare wrote:
> >> When a vfio container is preserved across exec, the task does not change,
> >> but it gets a new mm with locked_vm=0, and loses the count from existing
> >> dma mappings.  If the user later unmaps a dma mapping, locked_vm underflows
> >> to a large unsigned value, and a subsequent dma map request fails with
> >> ENOMEM in __account_locked_vm.
> >>
> >> To avoid underflow, grab and save the mm at the time a dma is mapped.
> >> Use that mm when adjusting locked_vm, rather than re-acquiring the saved
> >> task's mm, which may have changed.  If the saved mm is dead, do nothing.
> >>
> >> locked_vm is incremented for existing mappings in a subsequent patch.
> >>
> >> Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation")
> >> Cc: stable@xxxxxxxxxxxxxxx
> >> Signed-off-by: Steve Sistare <steven.sistare@xxxxxxxxxx>
> >> Reviewed-by: Kevin Tian <kevin.tian@xxxxxxxxx>
> >> ---
> >>  drivers/vfio/vfio_iommu_type1.c | 27 +++++++++++----------------
> >>  1 file changed, 11 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index 144f5bb..71f980b 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -100,6 +100,7 @@ struct vfio_dma {
> >>  	struct task_struct	*task;
> >>  	struct rb_root		pfn_list;	/* Ex-user pinned pfn list */
> >>  	unsigned long		*bitmap;
> >> +	struct mm_struct	*mm;
> >>  };
> >>
> >>  struct vfio_batch {
> >> @@ -420,8 +421,8 @@ static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async)
> >>  	if (!npage)
> >>  		return 0;
> >>
> >> -	mm = async ? get_task_mm(dma->task) : dma->task->mm;
> >> -	if (!mm)
> >> +	mm = dma->mm;
> >> +	if (async && !mmget_not_zero(mm))
> >>  		return -ESRCH; /* process exited */
> >
> > Just delete the async; the lock_acct always acts on the dma, which
> > always has a singular mm.
> >
> > Fix the few callers that need it to do the mmget_not_zero() before
> > calling in.
>
> Most of the callers pass async=true:
>     ret = vfio_lock_acct(dma, lock_acct, false);
>     vfio_lock_acct(dma, locked - unlocked, true);
>     ret = vfio_lock_acct(dma, 1, true);
>     vfio_lock_acct(dma, -unlocked, true);
>     vfio_lock_acct(dma, -1, true);
>     vfio_lock_acct(dma, -unlocked, true);
>     ret = mm_lock_acct(task, mm, lock_cap, npage, false);
>     mm_lock_acct(dma->task, dma->mm, dma->lock_cap, -npage, true);
>     vfio_lock_acct(dma, locked - unlocked, true);

Seems like if you make a lock_sub_acct() function that does the -1 *
and does the mmget, it will be OK?

Jason
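
For illustration, a minimal sketch of what such a helper could look like,
assuming vfio_lock_acct() has been reduced to the synchronous case (no
async argument) and charges dma->mm directly; the name lock_sub_acct()
comes from the suggestion above, and the exact signature is an assumption,
not code from this series:

	/*
	 * Hypothetical helper: take the mm reference and negate the page
	 * count in one place, so the many "subtract with async=true" call
	 * sites collapse into a single call.  If the mm is already dead
	 * there is nothing left to account against, so just return.
	 */
	static void lock_sub_acct(struct vfio_dma *dma, long npage)
	{
		if (!mmget_not_zero(dma->mm))
			return;			/* process exited */

		vfio_lock_acct(dma, -npage);	/* the "-1 *" part */
		mmput(dma->mm);
	}

With that, a call site like vfio_lock_acct(dma, -unlocked, true) would
become lock_sub_acct(dma, unlocked), and the remaining non-async callers
could take the mm reference themselves before calling the now-synchronous
vfio_lock_acct().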