Re: [RFC PATCH v2 12/22] iommufd: Allow mapping from guest_memfd

On Wed, Feb 19, 2025 at 09:35:16AM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 19, 2025 at 11:43:46AM +1100, Alexey Kardashevskiy wrote:
> > On 19/2/25 10:51, Jason Gunthorpe wrote:
> > > On Wed, Feb 19, 2025 at 10:35:28AM +1100, Alexey Kardashevskiy wrote:
> > > 
> > > > With in-place conversion, we could map the entire guest once in the HV IOMMU
> > > > and control the Cbit via the guest's IOMMU table (when available). Thanks,
> > > 
> > > Isn't it more complicated than that? I understood you need to have an
> > > IOPTE boundary in the hypervisor at any point where the guest Cbit
> > > changes - so you can't just dump 1G hypervisor pages to cover the
> > > whole VM, you have to actively resize ioptes?
> > 
> > When the guest Cbit changes, only the AMD RMP table requires an update,
> > not necessarily the NPT or IOPTEs.
> > (I may have misunderstood the question, what meaning does "dump 1G pages"
> > have?).
> 
> AFAIK that is not true: if there are mismatches in page size, i.e. the
> RMP is 2M and the IOPTE is 1G, then things do not work properly.

Just for clarity: at least for normal/nested page tables (but I'm
assuming the same applies to IOMMU mappings), 1G mappings are
handled the same way as 2MB mappings as far as RMP table checks are
concerned: each 2MB range is checked individually, as if it were
a separate 2MB mapping:

AMD Architecture Programmer's Manual Volume 2, 15.36.10,
"RMP and VMPL Access Checks":

  "Accesses to 1GB pages only install 2MB TLB entries when SEV-SNP is
  enabled, therefore this check treats 1GB accesses as 2MB accesses for
  purposes of this check."

So a 1GB mapping doesn't really impose more restrictions than a 2MB
mapping (unless there's something different about how RMP checks are
done for IOMMU).
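
To make the APM wording concrete, here is a minimal sketch of how the
check granule could be derived from the mapping size. It only models the
CPU-side rule quoted above; whether the IOMMU behaves the same way is an
assumption. The names rmp_check_granule() and rmp_check_base() are
hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)
#define SZ_1G (1ULL << 30)

/*
 * Hypothetical helper: granule at which the RMP check is applied for an
 * access through a mapping of map_size bytes. Per the APM text, a 1GB
 * access is treated as a 2MB access, so anything >= 2MB is checked at 2MB.
 */
static uint64_t rmp_check_granule(uint64_t map_size)
{
	/* Toy model: only the three x86 page sizes are considered. */
	assert(map_size == SZ_4K || map_size == SZ_2M || map_size == SZ_1G);
	return map_size >= SZ_2M ? SZ_2M : SZ_4K;
}

/* Hypothetical helper: base of the range that is checked for addr. */
static uint64_t rmp_check_base(uint64_t addr, uint64_t map_size)
{
	return addr & ~(rmp_check_granule(map_size) - 1);
}
```

With this model an access anywhere inside a 1GB mapping is checked
against the 2MB-aligned range containing it, exactly as it would be for
a 2MB mapping.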

But the point still stands for 4K RMP entries and 2MB mappings: a 2MB
mapping either requires private page RMP entries to be 2MB, or in the
case of 2MB mapping of shared pages, every page in the range must be
shared according to the corresponding RMP entries.
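
The rule in the previous paragraph can be sketched as a predicate over
the 512 RMP entries covering one 2MB-aligned range. This is a toy model
for discussion only: struct rmp_entry, its fields, and map_2m_ok() are
made-up names and do not reflect the real RMP layout or any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_2M 512

enum rmp_size { RMP_4K, RMP_2M };

/* Toy RMP entry: "assigned" means the page is private to the guest. */
struct rmp_entry {
	bool assigned;
	enum rmp_size size;
};

/*
 * rmp points at the 512 entries covering one 2MB-aligned range.
 * A private 2MB mapping needs a 2MB private RMP entry; a shared 2MB
 * mapping needs every 4K page in the range to be shared.
 */
static bool map_2m_ok(const struct rmp_entry *rmp, bool private_mapping)
{
	assert(rmp != NULL);

	if (private_mapping)
		return rmp[0].assigned && rmp[0].size == RMP_2M;

	for (size_t i = 0; i < PAGES_PER_2M; i++) {
		if (rmp[i].assigned)
			return false;
	}
	return true;
}
```

In this model a single private 4K page anywhere in the range is enough
to make a shared 2MB mapping invalid.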

> 
> It is why we had to do this:

I think, for the non-SEV-TIO use-case, it had more to do with the
inability to unmap a 4K range once a particular 4K page has been
converted from shared to private, if that page was originally installed
via a 2MB IOPTE. The guest could actively be DMA'ing to other shared
pages in the 2M range (though we can be assured it is not DMA'ing to the
particular 4K page it has converted to private), and the IOMMU doesn't
(AFAIK) have a way to atomically split an existing 2MB IOPTE to avoid
disrupting that DMA. So forcing everything to 4K ends up being necessary,
since we don't know in advance which ranges might contain 4K pages that
the guest will convert to private in the future.
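
A back-of-the-envelope sketch of the cost, assuming (as above) that an
IOPTE can only be torn down as a whole and not split atomically. The
helper collateral_pages() is hypothetical; it just counts how many other
4K pages lose their mapping when the IOPTE covering one converted page
is removed.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)

/*
 * Number of *other* 4K pages whose mapping is disturbed when the IOPTE
 * covering a single converted 4K page has to be unmapped as a whole.
 * iopte_size must be a power of two and at least 4K.
 */
static uint64_t collateral_pages(uint64_t iopte_size)
{
	assert(iopte_size >= SZ_4K && (iopte_size & (iopte_size - 1)) == 0);
	return iopte_size / SZ_4K - 1;
}
```

For a 2MB IOPTE this is 511 pages the guest may still be DMA'ing to; for
a 4K IOPTE it is zero, which is why limiting everything to 4K sidesteps
the problem.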

SEV-TIO might relax this restriction by making use of TMPM and the
PSMASH_IO command to split/"smash" RMP entries and IOMMU mappings to 4K
after the fact, but I'm not too familiar with the architecture/plans, so
Alexey can correct me on that.

-Mike

> 
> > > This was the whole motivation to adding the page size override kernel
> > > command line.
> 
> commit f0295913c4b4f377c454e06f50c1a04f2f80d9df
> Author: Joerg Roedel <jroedel@xxxxxxx>
> Date:   Thu Sep 5 09:22:40 2024 +0200
> 
>     iommu/amd: Add kernel parameters to limit V1 page-sizes
>     
>     Add two new kernel command line parameters to limit the page-sizes
>     used for v1 page-tables:
>     
>             nohugepages     - Limits page-sizes to 4KiB
>     
>             v2_pgsizes_only - Limits page-sizes to 4Kib/2Mib/1GiB; The
>                               same as the sizes used with v2 page-tables
>     
>     This is needed for multiple scenarios. When assigning devices to
>     SEV-SNP guests the IOMMU page-sizes need to match the sizes in the RMP
>     table, otherwise the device will not be able to access all shared
>     memory.
>     
>     Also, some ATS devices do not work properly with arbitrary IO
>     page-sizes as supported by AMD-Vi, so limiting the sizes used by the
>     driver is a suitable workaround.
>     
>     All-in-all, these parameters are only workarounds until the IOMMU core
>     and related APIs gather the ability to negotiate the page-sizes in a
>     better way.
>     
>     Signed-off-by: Joerg Roedel <jroedel@xxxxxxx>
>     Reviewed-by: Vasant Hegde <vasant.hegde@xxxxxxx>
>     Link: https://lore.kernel.org/r/20240905072240.253313-1-joro@xxxxxxxxxx
> 
> Jason



