On 12/16/2022 11:10 AM, Alex Williamson wrote:
> On Fri, 16 Dec 2022 10:42:13 -0500
> Steven Sistare <steven.sistare@xxxxxxxxxx> wrote:
>
>> On 12/16/2022 9:09 AM, Jason Gunthorpe wrote:
>>> On Thu, Dec 15, 2022 at 01:56:59PM -0800, Steve Sistare wrote:
>>>> When a vfio container is preserved across exec, the task does not change,
>>>> but it gets a new mm with locked_vm=0. If the user later unmaps a dma
>>>> mapping, locked_vm underflows to a large unsigned value, and a subsequent
>>>> dma map request fails with ENOMEM in __account_locked_vm.
>>>>
>>>> To avoid underflow, grab and save the mm at the time a dma is mapped.
>>>> Use that mm when adjusting locked_vm, rather than re-acquiring the saved
>>>> task's mm, which may have changed. If the saved mm is dead, do nothing.
>>>>
>>>> Signed-off-by: Steve Sistare <steven.sistare@xxxxxxxxxx>
>>>> ---
>>>>  drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++-------
>>>>  1 file changed, 10 insertions(+), 7 deletions(-)
>>>
>>> Add fixes lines and a CC stable
>>
>> This predates the update vaddr functionality, so AFAICT:
>>
>>   Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation")
>>
>> I'll wait on cc'ing stable until alex has chimed in.
>
> Technically, adding the stable Cc tag is still the correct approach per
> the stable process docs, but the Fixes: tag alone is generally
> sufficient to crank up the backport engines. The original
> implementation is probably the correct commit to identify, exec was
> certainly not considered there. Thanks,

Should I cc stable on the whole series, or re-send individually? If the
latter, which ones?

- Steve

>>> The subject should be more like 'vfio/typ1: Prevent corruption of mm->locked_vm via exec()'
>>
>> Underflow is a more precise description of the first corruption.
>> How about:
>>
>>   vfio/type1: Prevent underflow of locked_vm via exec()
>>
>>>> @@ -1687,6 +1689,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>>>  	get_task_struct(current->group_leader);
>>>>  	dma->task = current->group_leader;
>>>>  	dma->lock_cap = capable(CAP_IPC_LOCK);
>>>> +	dma->mm = dma->task->mm;
>>>
>>> This should be current->mm, current->group_leader->mm is not quite the
>>> same thing (and maybe another bug, I'm not sure)
>>
>> When are they different -- when the leader is a zombie?
>>
>> BTW I just noticed I need to update the comments about mm preceding these lines.
>>
>> - Steve
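
For readers following along, the underflow described in the commit message
can be modeled with a small user-space sketch in C. This is not the kernel
code: toy_mm, toy_account_locked_vm() and toy_map_after_exec() are
hypothetical names, the limit and page counts are arbitrary, and the
"if the saved mm is dead, do nothing" case is not modeled. It only shows
why decrementing the counter of a fresh mm wraps, and why charging the
unmap to the mm saved at map time avoids that.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an mm; only the counter discussed above. */
struct toy_mm {
	unsigned long locked_vm;	/* pages accounted as locked */
};

/*
 * Simplified model of locked_vm accounting (an assumption, not the
 * kernel's __account_locked_vm): increments are checked against a
 * limit, decrements are applied unconditionally.
 */
static int toy_account_locked_vm(struct toy_mm *mm, unsigned long npages,
				 bool inc, unsigned long limit)
{
	if (inc) {
		if (mm->locked_vm + npages > limit)
			return -ENOMEM;
		mm->locked_vm += npages;
	} else {
		/* Wraps to a huge value if npages > mm->locked_vm. */
		mm->locked_vm -= npages;
	}
	return 0;
}

/*
 * Map 16 pages, simulate exec (the task gets a new mm with locked_vm=0),
 * unmap the 16 pages, then try to map 4 more. Returns the result of the
 * final map request.
 */
static int toy_map_after_exec(bool use_saved_mm)
{
	const unsigned long limit = 1024;	/* arbitrary page limit */
	struct toy_mm old_mm = { 0 }, new_mm = { 0 };
	struct toy_mm *task_mm = &old_mm;
	struct toy_mm *saved_mm;

	toy_account_locked_vm(task_mm, 16, true, limit);
	saved_mm = task_mm;		/* mm captured at map time */

	task_mm = &new_mm;		/* exec: same task, new mm */

	/* Unmap: charge either the saved mm or the task's current mm. */
	toy_account_locked_vm(use_saved_mm ? saved_mm : task_mm,
			      16, false, limit);

	return toy_account_locked_vm(task_mm, 4, true, limit);
}
```

In this model, toy_map_after_exec(false) charges the unmap to the new mm,
so its counter wraps and the later map request is rejected, while
toy_map_after_exec(true) leaves the new mm's counter untouched.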