Re: [PATCH V2 2/5] vfio/type1: prevent locked_vm underflow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/13/2022 3:23 PM, Alex Williamson wrote:
> On Tue, 13 Dec 2022 11:40:56 -0800
> Steve Sistare <steven.sistare@xxxxxxxxxx> wrote:
> 
>> When a vfio container is preserved across exec using the VFIO_UPDATE_VADDR
>> interfaces, locked_vm of the new mm becomes 0.  If the user later unmaps a
>> dma mapping, locked_vm underflows to a large unsigned value, and a
>> subsequent dma map request fails with ENOMEM in __account_locked_vm.
>>
>> To avoid underflow, do not decrement locked_vm during unmap if the
>> dma's mm has changed.  To restore the correct locked_vm count, when
>> VFIO_DMA_MAP_FLAG_VADDR is used and the dma's mm has changed, add
>> the mapping's pinned page count to the new mm->locked_vm, subject
>> to the rlimit.  Now that mediated devices are excluded when using
>> VFIO_UPDATE_VADDR, the amount of pinned memory equals the size of
>> the mapping.
> 
> Fixes: c3cbab24db38 ("vfio/type1: implement interfaces to update vaddr")
> 
> 
>> Signed-off-by: Steve Sistare <steven.sistare@xxxxxxxxxx>
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 23 +++++++++++++++++++----
>>  1 file changed, 19 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 80bdb4d..35a1a52 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -100,6 +100,7 @@ struct vfio_dma {
>>  	struct task_struct	*task;
>>  	struct rb_root		pfn_list;	/* Ex-user pinned pfn list */
>>  	unsigned long		*bitmap;
>> +	struct mm_struct	*mm;
>>  };
>>  
>>  struct vfio_batch {
>> @@ -1165,7 +1166,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>>  					    &iotlb_gather);
>>  	}
>>  
>> -	if (do_accounting) {
>> +	if (do_accounting && current->mm == dma->mm) {
> 
> 
> This seems incompatible with ffed0518d871 ("vfio: remove useless
> judgement") where we no longer assume that the unmap mm is the same as
> the mapping mm.

They are compatible.  My fix allows another task to unmap, but only decreases
locked_vm if the current mm matches the original mm that locked it.  And the
"original" mm is updated by MAP_FLAG_VADDR.

> Does this need to get_task_mm(dma->task) and compare that mm to dma->mm
> to determine whether an exec w/o vaddr remapping has occurred?  That's
> the only use case I can figure out where grabbing the mm for dma->mm
> actually makes any sense at all.

The mm grab does detect an exec.  Before exec, at map time, we get task and grab
its mm.  During exec, task gets a new mm.  The old mm becomes defunct, but we
still hold it and can examine its pointer address.

The new code does not require that current == dma->task.

>>  		vfio_lock_acct(dma, -unlocked, true);
>>  		return 0;
>>  	}
>> @@ -1178,6 +1179,7 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
>>  	vfio_unmap_unpin(iommu, dma, true);
>>  	vfio_unlink_dma(iommu, dma);
>>  	put_task_struct(dma->task);
>> +	mmdrop(dma->mm);
>>  	vfio_dma_bitmap_free(dma);
>>  	if (dma->vaddr_invalid) {
>>  		iommu->vaddr_invalid_count--;
>> @@ -1623,9 +1625,20 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  			   dma->size != size) {
>>  			ret = -EINVAL;
>>  		} else {
>> -			dma->vaddr = vaddr;
>> -			dma->vaddr_invalid = false;
>> -			iommu->vaddr_invalid_count--;
>> +			if (current->mm != dma->mm) {
>> +				ret = vfio_lock_acct(dma, size >> PAGE_SHIFT,
>> +						     0);
>> +				if (!ret) {
>> +					mmdrop(dma->mm);
>> +					dma->mm = current->mm;
>> +					mmgrab(dma->mm);
>> +				}
>> +			}
>> +			if (!ret) {
>> +				dma->vaddr = vaddr;
>> +				dma->vaddr_invalid = false;
>> +				iommu->vaddr_invalid_count--;
>> +			}
> 
> Poor flow, shouldn't this be:
> 
> 			if (current->mm != dma->mm) {
> 				ret = vfio_lock_acct(dma,
> 						     size >> PAGE_SHIFT, 0);
> 				if (ret)
> 					goto out_unlock;
> 
> 				mmdrop(dma->mm);
> 				dma->mm = current->mm;
> 				mmgrab(dma->mm);
> 			}
> 			dma->vaddr = vaddr;
> 			dma->vaddr_invalid = false;
> 			iommu->vaddr_invalid_count--;

Better, will do, thanks.

- Steve

>>  			wake_up_all(&iommu->vaddr_wait);
>>  		}
>>  		goto out_unlock;
>> @@ -1683,6 +1696,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  	get_task_struct(current->group_leader);
>>  	dma->task = current->group_leader;
>>  	dma->lock_cap = capable(CAP_IPC_LOCK);
>> +	dma->mm = dma->task->mm;
>> +	mmgrab(dma->mm);
>>  
>>  	dma->pfn_list = RB_ROOT;
>>  
> 



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux