Re: [PATCH 1/5] vfio/type1: use pinned_vm instead of locked_vm to account pinned pages

Daniel Jordan <daniel.m.jordan@xxxxxxxxxx> · Mon, 11 Feb 2019 18:11:53 -0500

On Mon, Feb 11, 2019 at 03:56:20PM -0700, Jason Gunthorpe wrote:
> On Mon, Feb 11, 2019 at 05:44:33PM -0500, Daniel Jordan wrote:
> > @@ -266,24 +267,15 @@ static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async)
> >  	if (!mm)
> >  		return -ESRCH; /* process exited */
> >  
> > -	ret = down_write_killable(&mm->mmap_sem);
> > -	if (!ret) {
> > -		if (npage > 0) {
> > -			if (!dma->lock_cap) {
> > -				unsigned long limit;
> > -
> > -				limit = task_rlimit(dma->task,
> > -						RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> > +	pinned_vm = atomic64_add_return(npage, &mm->pinned_vm);
> >  
> > -				if (mm->locked_vm + npage > limit)
> > -					ret = -ENOMEM;
> > -			}
> > +	if (npage > 0 && !dma->lock_cap) {
> > +		unsigned long limit = task_rlimit(dma->task, RLIMIT_MEMLOCK) >>
> > +
> > -					PAGE_SHIFT;
> 
> I haven't looked at this super closely, but how does this stuff work?
> 
> do_mlock doesn't touch pinned_vm, and this doesn't touch locked_vm...
> 
> Shouldn't all this be 'if (locked_vm + pinned_vm < RLIMIT_MEMLOCK)' ?
>
> Otherwise MEMLOCK is really doubled..

So this has been a problem for some time, but it's not as easy as adding them
together, see [1][2] for a start.

The locked_vm/pinned_vm issue definitely needs fixing, but all this series is
trying to do is account to the right counter.

Daniel

[1] http://lkml.kernel.org/r/20130523104154.GA23650@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[2] http://lkml.kernel.org/r/20130524140114.GK23650@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx