Re: [PATCH RFC 04/12] kernel/user: Allow user::locked_vm to be usable for iommufd

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 22, 2022 at 03:28:22PM +0100, Niklas Schnelle wrote:
> On Fri, 2022-03-18 at 14:27 -0300, Jason Gunthorpe wrote:
> > Following the pattern of io_uring, perf, skb, and bpf iommfd will use
>                                                 iommufd ----^
> > user->locked_vm for accounting pinned pages. Ensure the value is included
> > in the struct and export free_uid() as iommufd is modular.
> > 
> > user->locked_vm is the correct accounting to use for ulimit because it is
> > per-user, and the ulimit is not supposed to be per-process. Other
> > places (vfio, vdpa and infiniband) have used mm->pinned_vm and/or
> > mm->locked_vm for accounting pinned pages, but this is only per-process
> > and inconsistent with the majority of the kernel.
> 
> Since this will replace parts of vfio this difference seems
> significant. Can you explain this a bit more?

I'm not sure what to say more, this is the correct way to account for
this. It is natural to see it is right because the ulimit is supposted
to be global to the user, not effectively reset every time the user
creates a new process.

So checking the ulimit against a per-process variable in the mm_struct
doesn't make too much sense.

> I'm also a bit confused how io_uring handles this. When I stumbled over
> the problem fixed by 6b7898eb180d ("io_uring: fix imbalanced sqo_mm
> accounting") and from that commit description I seem to rember that
> io_uring also accounts in mm->locked_vm too? 

locked/pinned_pages in the mm is kind of a debugging counter, it
indicates how many pins the user obtained through this mm. AFAICT its
only correct use is to report through proc. Things are supposed to
update it, but there is no reason to limit it as the user limit
supersedes it.

The commit you pointed at is fixing that io_uring corrupted the value.

Since VFIO checks locked/pinned_pages against the ulimit would blow up
when the value was wrong.

> In fact I stumbled over that because the wrong accounting in
> io_uring exhausted the applied to vfio (I was using a QEMU utilizing
> io_uring itself).

I'm pretty interested in this as well, do you have anything you can
share?

Thanks,
Jason



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux