Re: [PATCH v4 1/8] mm: make PTE_MARKER_SWAPIN_ERROR more general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri,  7 Jul 2023 14:55:33 -0700 Axel Rasmussen <axelrasmussen@xxxxxxxxxx> wrote:

> Future patches will re-use PTE_MARKER_SWAPIN_ERROR to implement
> UFFDIO_POISON, so make some various preparations for that:
> 
> First, rename it to just PTE_MARKER_POISONED. The "SWAPIN" can be
> confusing since we're going to re-use it for something not really
> related to swap. This can be particularly confusing for things like
> hugetlbfs, which doesn't support swap whatsoever. Also rename some
> various helper functions.
> 
> Next, fix pte marker copying for hugetlbfs. Previously, it would WARN on
> seeing a PTE_MARKER_SWAPIN_ERROR, since hugetlbfs doesn't support swap.
> But, since we're going to re-use it, we want it to go ahead and copy it
> just like non-hugetlbfs memory does today. Since the code to do this is
> more complicated now, pull it out into a helper which can be re-used in
> both places. While we're at it, also make it slightly more explicit in
> its handling of e.g. uffd wp markers.
> 
> For non-hugetlbfs page faults, instead of returning VM_FAULT_SIGBUS for
> an error entry, return VM_FAULT_HWPOISON. For most cases this change
> doesn't matter, e.g. a userspace program would receive a SIGBUS either
> way. But for UFFDIO_POISON, this change will let KVM guests get an MCE
> out of the box, instead of giving a SIGBUS to the hypervisor and
> requiring it to somehow inject an MCE.
> 
> Finally, for hugetlbfs faults, handle PTE_MARKER_POISONED, and return
> VM_FAULT_HWPOISON_LARGE in such cases. Note that this can't happen today
> because the lack of swap support means we'll never end up with such a
> PTE anyway, but this behavior will be needed once such entries *can*
> show up via UFFDIO_POISON.
> 
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -523,6 +523,25 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
>  	return atomic_read(&mm->tlb_flush_pending) > 1;
>  }
>  
> +/*
> + * Computes the pte marker to copy from the given source entry into dst_vma.
> + * If no marker should be copied, returns 0.
> + * The caller should insert a new pte created with make_pte_marker().
> + */
> +static inline pte_marker copy_pte_marker(
> +		swp_entry_t entry, struct vm_area_struct *dst_vma)
> +{
> +	pte_marker srcm = pte_marker_get(entry);
> +	/* Always copy error entries. */
> +	pte_marker dstm = srcm & PTE_MARKER_POISONED;
> +
> +	/* Only copy PTE markers if UFFD register matches. */
> +	if ((srcm & PTE_MARKER_UFFD_WP) && userfaultfd_wp(dst_vma))
> +		dstm |= PTE_MARKER_UFFD_WP;
> +
> +	return dstm;
> +}

Breaks the build with CONFIG_MMU=n (arm allnoconfig).  pte_marker isn't
defined.

I'll slap #ifdef CONFIG_MMU around this function, but probably somethng more
fine-grained could be used, like CONFIG_PTE_MARKER_UFFD_WP.  Please
consider.

btw, both copy_pte_marker() and pte_install_uffd_wp_if_needed() look
far too large to justify inlining.  Please review the desirability of
this.





[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux