RE: [PATCH V6 2/7] vfio/type1: prevent underflow of locked_vm via exec()

"Tian, Kevin" <kevin.tian@xxxxxxxxx> · Mon, 19 Dec 2022 07:48:56 +0000

> From: Steve Sistare <steven.sistare@xxxxxxxxxx>
> Sent: Saturday, December 17, 2022 2:51 AM
> 
> When a vfio container is preserved across exec, the task does not change,
> but it gets a new mm with locked_vm=0.  If the user later unmaps a dma
> mapping, locked_vm underflows to a large unsigned value, and a subsequent
> dma map request fails with ENOMEM in __account_locked_vm.
> 
> To avoid underflow, grab and save the mm at the time a dma is mapped.
> Use that mm when adjusting locked_vm, rather than re-acquiring the saved
> task's mm, which may have changed.  If the saved mm is dead, do nothing.

worth clarifying that locked_vm of the new mm is still not fixed.

> @@ -1664,15 +1666,7 @@ static int vfio_dma_do_map(struct vfio_iommu
> *iommu,
>  	 * against the locked memory limit and we need to be able to do both
>  	 * outside of this call path as pinning can be asynchronous via the
>  	 * external interfaces for mdev devices.  RLIMIT_MEMLOCK requires
> a
> -	 * task_struct and VM locked pages requires an mm_struct, however
> -	 * holding an indefinite mm reference is not recommended,
> therefore we
> -	 * only hold a reference to a task.  We could hold a reference to
> -	 * current, however QEMU uses this call path through vCPU threads,
> -	 * which can be killed resulting in a NULL mm and failure in the
> unmap
> -	 * path when called via a different thread.  Avoid this problem by
> -	 * using the group_leader as threads within the same group require
> -	 * both CLONE_THREAD and CLONE_VM and will therefore use the
> same
> -	 * mm_struct.
> +	 * task_struct and VM locked pages requires an mm_struct.

IMHO the rationale why choosing group_leader still applies...

otherwise this looks good to me:

Reviewed-by: Kevin Tian <kevin.tian@xxxxxxxxx>