Re: [PATCH v2 06/19] mm/pagewalk: Check pfnmap for folio_walk_start()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 29, 2024 at 08:35:49AM +0200, David Hildenbrand wrote:
> Fortunately, we did an excellent job at documenting vm_normal_page():
> 
>  * There are 2 broad cases. Firstly, an architecture may define a pte_special()
>  * pte bit, in which case this function is trivial. Secondly, an architecture
>  * may not have a spare pte bit, which requires a more complicated scheme,
>  * described below.
>  *
>  * A raw VM_PFNMAP mapping (ie. one that is not COWed) is always considered a
>  * special mapping (even if there are underlying and valid "struct pages").
>  * COWed pages of a VM_PFNMAP are always normal.
>  *
>  * The way we recognize COWed pages within VM_PFNMAP mappings is through the
>  * rules set up by "remap_pfn_range()": the vma will have the VM_PFNMAP bit
>  * set, and the vm_pgoff will point to the first PFN mapped: thus every special
>  * mapping will always honor the rule
>  *
>  *	pfn_of_page == vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT)
>  *
>  * And for normal mappings this is false.
>  *
> 
> remap_pfn_range_notrack() will currently handle that for us:
> 
> if (is_cow_mapping(vma->vm_flags)) {
> 	if (addr != vma->vm_start || end != vma->vm_end)
> 		return -EINVAL;
> }
> 
> Even if [1] would succeed, the is_cow_mapping() check will return NULL and it will
> all work as expected, even without pte_special().

IMHO referencing vm_pgoff is ambiguous, and could be wrong, if without a
clear contract.

For example, consider when the driver setup a MAP_PRIVATE + VM_PFNMAP vma,
vm_pgoff to be not the "base PFN" but some random value, then for a COWed
page it's possible the calculation accidentally satisfies "pfn ==
vma->vm_pgoff + off".  Then it could wrongly return NULL rather than the
COWed anonymous page here.  This is extremely unlikely, but just to show
why it's wrong to reference it at all.

> 
> Because VM_PFNMAP is easy: in a !COW mapping, everything is special.

Yes it's safe for vfio-pci, as vfio-pci doesn't have private mappings.  But
still, I don't think it's clear enough now on how VM_PFNMAP should be
mapped.

-- 
Peter Xu





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux