RE: [PATCH V4 2/5] vfio/type1: prevent locked_vm underflow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> From: Steve Sistare <steven.sistare@xxxxxxxxxx>
> Sent: Thursday, December 15, 2022 5:25 AM
> 
> When a vfio container is preserved across exec using the
> VFIO_UPDATE_VADDR
> interfaces, locked_vm of the new mm becomes 0.  If the user later unmaps a
> dma mapping, locked_vm underflows to a large unsigned value, and a
> subsequent dma map request fails with ENOMEM in __account_locked_vm.

Isn't it the problem more than underflow? w/o copying locked_vm to the
new mm it means rlimit of the new mm could be exceeded since only
newly pinned pages after exec are counted.

> @@ -424,6 +425,10 @@ static int vfio_lock_acct(struct vfio_dma *dma, long
> npage, bool async)
>  	if (!mm)
>  		return -ESRCH; /* process exited */
> 
> +	/* Avoid locked_vm underflow */
> +	if (dma->mm != mm && npage < 0)
> +		goto out;
> +

the comment is unclear why the condition will lead to underflow.

It's also unclear why the guard is only for exec case but not fork-exec.

According to this patch locked_vm was not copied to the new mm in
both cases before this fix. Then why would underflow only happen
in one but not in the other?

Anyway more explanation is appreciated here.

> +static int vfio_change_dma_owner(struct vfio_dma *dma)
> +{
> +	int ret = 0;
> +	struct mm_struct *mm = get_task_mm(dma->task);

should this be current->group_leader? otherwise in fork-exec case
dma->mm is still equal to dma->task->mm.

> +
> +	if (dma->mm != mm) {
> +		long npage = dma->size >> PAGE_SHIFT;
> +		bool new_lock_cap = capable(CAP_IPC_LOCK);
> +		struct task_struct *new_task = current->group_leader;
> +
> +		ret = mmap_write_lock_killable(new_task->mm);
> +		if (ret)
> +			goto out;
> +
> +		ret = __account_locked_vm(new_task->mm, npage, true,
> +					  new_task, new_lock_cap);
> +		mmap_write_unlock(new_task->mm);
> +		if (ret)
> +			goto out;
> +
> +		if (dma->task != new_task) {
> +			vfio_lock_acct(dma, -npage, 0);

Presumably this should be always done on the old mm, even in exec case.




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux