Hi Jason,

On Thu, 4 Mar 2021 15:02:53 -0400, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Thu, Mar 04, 2021 at 11:01:44AM -0800, Jacob Pan wrote:
>
> > > For something like qemu I'd expect to put the qemu process in a cgroup
> > > with 1 PASID. Who cares what qemu uses the PASID for, or how it was
> > > allocated?
> >
> > For vSVA, we will need one PASID per guest process. But that is up to
> > the admin based on whether or how many SVA capable devices are directly
> > assigned.
>
> I hope the virtual IOMMU driver can communicate the PASID limit and
> the cgroup machinery in the guest can know what the actual limit is.
>
For VT-d, the emulated vIOMMU can tell the guest IOMMU driver how many
PASID bits are supported (the PASID size field in the extended capability
register). But it cannot communicate how many PASIDs are in the pool (the
host cgroup capacity). The QEMU process may not be the only one in its
cgroup, so the vIOMMU cannot give hard guarantees. I don't see a good way
to communicate this accurately at runtime as the process migrates or the
limit changes.

We were thinking of adopting the "Limits" model as defined in the
cgroup-v2 doc:
"
Limits
------

A child can only consume up to the configured amount of the resource.
Limits can be over-committed - the sum of the limits of children can
exceed the amount of resource available to the parent.
"

So the guest cgroup would still think it has the full 20 bits of PASID at
its disposal, but PASID allocation may fail before reaching the full 20
bits (2^20, about 1M). Similarly, on the host side, we only enforce the
limit set by the cgroup; we do not guarantee it.

> I was thinking of a case where qemu is using a single PASID to setup
> the guest kVA or similar
>
Got it.

> Jason


Thanks,

Jacob
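
To illustrate the "Limits" model quoted above, here is a minimal Python
sketch of hierarchical, over-committed limits. It is a toy model only: the
class PasidCgroup, the method try_charge(), and the limits of 10 and 8 are
hypothetical names and numbers chosen for illustration. This is not the
kernel's cgroup code and not a proposed interface.

```python
class PasidCgroup:
    """Toy model of a cgroup-v2 "Limits" style controller (hypothetical)."""

    def __init__(self, name, limit, parent=None):
        self.name = name
        self.limit = limit      # maximum PASIDs this group may hold
        self.parent = parent    # None for the root of the hierarchy
        self.usage = 0          # PASIDs charged to this group and its children

    def try_charge(self, count=1):
        """Charge `count` PASIDs here and in every ancestor, or fail."""
        # First pass: verify that no group on the path to the root
        # would exceed its own limit.
        node = self
        while node is not None:
            if node.usage + count > node.limit:
                return False
            node = node.parent
        # Second pass: every limit holds, so commit the charge.
        node = self
        while node is not None:
            node.usage += count
            node = node.parent
        return True


# Over-commit: the children's limits sum to 16, but the parent only has 10.
host = PasidCgroup("host", limit=10)
vm_a = PasidCgroup("vm_a", limit=8, parent=host)
vm_b = PasidCgroup("vm_b", limit=8, parent=host)

# Each child tries to allocate up to its own limit.
a_got = sum(vm_a.try_charge() for _ in range(8))
b_got = sum(vm_b.try_charge() for _ in range(8))

print(a_got, b_got, host.usage)
```

In this sketch vm_a reaches its own limit, while vm_b starts failing
before it reaches its limit because the parent is exhausted first. That is
the "enforce but not guarantee" behavior described above: a limit caps
consumption but reserves nothing.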