On 4/18/2017 8:24 AM, Peter Xu wrote:
> On Mon, Apr 17, 2017 at 03:32:20PM -0600, Alex Williamson wrote:
>> On Tue, 18 Apr 2017 01:02:12 +0530
>> Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
>>
>>> On 4/18/2017 12:49 AM, Alex Williamson wrote:
>>>> On Tue, 18 Apr 2017 00:35:06 +0530
>>>> Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
>>>>
>>>>> On 4/17/2017 8:02 PM, Alex Williamson wrote:
>>>>>> On Mon, 17 Apr 2017 14:47:54 +0800
>>>>>> Peter Xu <peterx@xxxxxxxxxx> wrote:
>>>>>>
>>>>>>> On Sun, Apr 16, 2017 at 07:42:27PM -0600, Alex Williamson wrote:
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>>> -static void vfio_lock_acct(struct task_struct *task, long npage)
>>>>>>>> +static int vfio_lock_acct(struct task_struct *task, long npage, bool lock_cap)
>>>>>>>>  {
>>>>>>>> -    struct vwork *vwork;
>>>>>>>>      struct mm_struct *mm;
>>>>>>>>      bool is_current;
>>>>>>>> +    int ret;
>>>>>>>>
>>>>>>>>      if (!npage)
>>>>>>>> -        return;
>>>>>>>> +        return 0;
>>>>>>>>
>>>>>>>>      is_current = (task->mm == current->mm);
>>>>>>>>
>>>>>>>>      mm = is_current ? task->mm : get_task_mm(task);
>>>>>>>>      if (!mm)
>>>>>>>> -        return; /* process exited */
>>>>>>>> +        return -ESRCH; /* process exited */
>>>>>>>>
>>>>>>>> -    if (down_write_trylock(&mm->mmap_sem)) {
>>>>>>>> -        mm->locked_vm += npage;
>>>>>>>> -        up_write(&mm->mmap_sem);
>>>>>>>> -        if (!is_current)
>>>>>>>> -            mmput(mm);
>>>>>>>> -        return;
>>>>>>>> -    }
>>>>>>>> +    ret = down_write_killable(&mm->mmap_sem);
>>>>>>>> +    if (!ret) {
>>>>>>>> +        if (npage < 0 || lock_cap) {
>>>>>>>
>>>>>>> Nit: maybe we can avoid passing in lock_cap in all the callers of
>>>>>>> vfio_lock_acct() and fetch it via has_capability() only if npage < 0?
>>>>>>> IMHO that'll keep the vfio_lock_acct() interface cleaner, and we won't
>>>>>>> need to pass in "false" any time when doing unpins.
>>>>>>
>>>>>> Unfortunately vfio_pin_pages_remote() needs to know about lock_cap
>>>>>> since it tests whether the user is exceeding their locked memory
>>>>>> limit. The other callers could certainly get away with
>>>>>> vfio_lock_acct() testing the capability itself, but that would add a
>>>>>> redundant call for the most common user. I'm not a big fan of passing
>>>>>> a lock_cap bool either, but it seemed the best fix for now. The
>>>>>> cleanest alternative I can come up with is this (untested):
>>>>>>
>>>>>
>>>>> In my opinion, passing 'bool lock_cap' looks much cleaner and simpler.
>>>>>
>>>>> Reviewed-by: Kirti Wankhede <kwankhede@xxxxxxxxxx>
>>>>
>>>> Well shoot, I was just starting to warm up to the bool*. I like that
>>>> we're not presuming the polarity for the callers we expect to be
>>>> removing pages, and I generally just dislike passing fixed bool
>>>> parameters to change the function behavior. I've cleaned it up a bit
>>>> further and was starting to do some testing on this, which I'd propose
>>>> for v5. Does it change your opinion?
>>>
>>> If passing a fixed bool parameter is the concern, then I would lean
>>> towards Peter's suggestion. vfio_pin_pages_remote() will check the lock
>>> capability outside vfio_lock_acct() and again in vfio_lock_acct(). At
>>> other places, it will be taken care of within vfio_lock_acct().
>>
>> Sorry, I don't see that as a viable option. Testing for CAP_IPC_LOCK in
>> both vfio_pin_pages_remote() and vfio_lock_acct() results in over a
>> 10% performance hit on the mapping path with a custom micro-benchmark.
>> In fact, it suggests we should probably pass that from even higher in
>> the call stack. Thanks,
>
> Sorry I wasn't aware of such a performance degradation with such a
> change. Then I would be perfectly fine with either current patch, or
> the new one you proposed (with bool *). Thanks,
>

Sorry, I wasn't aware of that either. Looking at the v5 version now.

Thanks,
Kirti
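
For anyone following the trade-off above, here is a minimal userspace sketch (not kernel code) of one possible reading of the bool * idea. The actual alternative patch is trimmed from this thread, so everything below is an assumption made for illustration: struct acct_mm, check_lock_cap(), cap_checks, pin_remote(), and the convention that a NULL pointer means "test the capability yourself" are all hypothetical names and choices. The point it tries to show is only that a caller which already tested the capability can hand the result down, so the test runs once per pin rather than twice, while unpins never need it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the accounting state; not the kernel's mm_struct. */
struct acct_mm {
    long locked_vm;   /* pages currently accounted as locked */
    long limit;       /* stand-in for the locked-memory limit, in pages */
    bool privileged;  /* stand-in for "task has CAP_IPC_LOCK" */
};

/* Counts how often the (assumed expensive) capability test runs. */
static int cap_checks;

/* Hypothetical stand-in for the capability test. */
static bool check_lock_cap(const struct acct_mm *mm)
{
    cap_checks++;
    return mm->privileged;
}

/*
 * Sketch of the accounting helper with a bool * parameter.
 * Assumed convention: lock_cap == NULL means the caller has not tested
 * the capability, so it is tested here, and only when adding pages.
 */
static int lock_acct(struct acct_mm *mm, long npage, const bool *lock_cap)
{
    if (!npage)
        return 0;

    if (npage > 0) {
        bool cap = lock_cap ? *lock_cap : check_lock_cap(mm);

        if (!cap && mm->locked_vm + npage > mm->limit)
            return -ENOMEM;
    }

    mm->locked_vm += npage;
    assert(mm->locked_vm >= 0);  /* unpins must not underflow the count */
    return 0;
}

/*
 * Sketch of a pinning caller that needs the capability for its own
 * limit test and passes the cached result down instead of retesting.
 */
static int pin_remote(struct acct_mm *mm, long npage)
{
    bool lock_cap = check_lock_cap(mm);  /* tested once, by the caller */

    if (!lock_cap && mm->locked_vm + npage > mm->limit)
        return -ENOMEM;

    return lock_acct(mm, npage, &lock_cap);
}
```

In this sketch, pin_remote() costs one capability test per call, lock_acct(mm, npage, NULL) with a positive npage costs one, and an unpin (negative npage) costs none, so no caller has to pass a fixed "false".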