Re: [PATCH v2 2/5] userfaultfd: introduce access-likely mode for common operations

On Mon, Jul 18, 2022 at 04:47:45AM -0700, Nadav Amit wrote:
> @@ -261,6 +272,7 @@ struct uffdio_copy {
>  struct uffdio_zeropage {
>  	struct uffdio_range range;
>  #define UFFDIO_ZEROPAGE_MODE_DONTWAKE		((__u64)1<<0)
> +#define UFFDIO_ZEROPAGE_MODE_ACCESS_LIKELY	((__u64)1<<1)

Would the access hint help the zeropage use case?  I remember you commenting
earlier that it won't help, since we won't reclaim the zero page anyway.

It also won't help if this flag exists only to accompany the follow-up
WRITE_HINT (since then there'll be a CoW): attaching WRITE_HINT without
ACCESS_HINT doesn't make sense, so WRITE_HINT by itself seems to be enough
for ZEROPAGE to me.
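
To make the implication concrete, here is a minimal userspace sketch (not
kernel code).  HINT_ACCESS, HINT_WRITE and effective_hints() are hypothetical
names and bit values made up for illustration; they are not the uapi
definitions from this series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hint bits for illustration only; not the uapi values. */
#define HINT_ACCESS	(UINT64_C(1) << 0)
#define HINT_WRITE	(UINT64_C(1) << 1)

/*
 * A write is also an access, so a write hint implies an access hint.
 * Under that assumption an interface only needs to accept the write hint
 * to cover both.
 */
static uint64_t effective_hints(uint64_t hints)
{
	if (hints & HINT_WRITE)
		hints |= HINT_ACCESS;
	return hints;
}

/* Small self-check showing the implication in both directions. */
void hint_examples(void)
{
	assert(effective_hints(HINT_WRITE) == (HINT_ACCESS | HINT_WRITE));
	assert(effective_hints(HINT_ACCESS) == HINT_ACCESS);
	assert(effective_hints(0) == 0);
}
```

If that implication holds, a ZEROPAGE caller that passes only the write hint
gets the same effective hints as one that passes both.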

[...]

> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 421784d26651..c15679f3eb6a 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -65,6 +65,7 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
>  	bool writable = dst_vma->vm_flags & VM_WRITE;
>  	bool vm_shared = dst_vma->vm_flags & VM_SHARED;
>  	bool page_in_cache = page->mapping;
> +	bool prefault = !(uffd_flags & UFFD_FLAGS_ACCESS_LIKELY);

I think it's okay to name it "prefault" as a temp var, but ideally IMHO we
shouldn't assume what the user app is doing - it is only installing some
uffd pgtables with !ACCESS_LIKELY, and that does not necessarily have to be
a prefault process.

>  	spinlock_t *ptl;
>  	struct inode *inode;
>  	pgoff_t offset, max_off;
> @@ -92,6 +93,11 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
>  		 */
>  		_dst_pte = pte_wrprotect(_dst_pte);
>  
> +	if (prefault && arch_wants_old_prefaulted_pte())
> +		_dst_pte = pte_mkold(_dst_pte);
> +	else
> +		_dst_pte = pte_sw_mkyoung(_dst_pte);

Could you explain why we couldn't unconditionally mkold here even for x86?

It would be a pity if this feature bit were only useful on arm64 and did
not cover x86 (which I think is still the majority so far).

IMHO it's slightly different here compared to kernel prefaults - the user
app may not be aware of kernel prefaults, but here !ACCESS_HINT is
user-aware, and it's what the user app explicitly provided.  IMO it's
already a stronger proof of a cold page.

The other thing that confused me here is that arch_wants_old_prefaulted_pte()
returns true if arm64 supports hardware AF.  However, for all the other archs
(including x86_64 which, afaict, supports AF too on most models) it'll
always return false.  Do you know the rationale behind that?
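
To spell out the difference between the two policies, here is a small
userspace model (a sketch, not kernel code).  The enum and both functions are
hypothetical names; "arch_wants_old" stands in for the return value of
arch_wants_old_prefaulted_pte(), and the model collapses pte_sw_mkyoung() to
simply "young", which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the PTE accessed state; not a kernel type. */
enum pte_age { PTE_OLD, PTE_YOUNG };

/*
 * Policy in the quoted hunk: the PTE is made old only when the hint is
 * absent AND the architecture opts in to old prefaulted PTEs.
 */
static enum pte_age age_as_posted(bool access_likely, bool arch_wants_old)
{
	bool prefault = !access_likely;

	if (prefault && arch_wants_old)
		return PTE_OLD;
	return PTE_YOUNG;
}

/* Alternative asked about above: old whenever the hint is absent. */
static enum pte_age age_unconditional(bool access_likely)
{
	return access_likely ? PTE_YOUNG : PTE_OLD;
}

/* Compare both policies for an opting-in arch and a non-opting-in arch. */
void compare_policies(void)
{
	/* arch opts in (e.g. arm64 with hardware AF): hint is honoured */
	assert(age_as_posted(false, true) == PTE_OLD);
	assert(age_as_posted(true, true) == PTE_YOUNG);

	/* arch does not opt in (e.g. x86_64 today): hint has no effect */
	assert(age_as_posted(false, false) == PTE_YOUNG);
	assert(age_as_posted(true, false) == PTE_YOUNG);

	/* unconditional variant: hint is honoured regardless of arch */
	assert(age_unconditional(false) == PTE_OLD);
	assert(age_unconditional(true) == PTE_YOUNG);
}
```

In this model the posted policy yields the same result with or without the
hint whenever the arch hook returns false, which is the x86 concern above.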

> +
>  	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
>  
>  	if (vma_is_shmem(dst_vma)) {
> @@ -202,7 +208,8 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
>  static int mfill_zeropage_pte(struct mm_struct *dst_mm,
>  			      pmd_t *dst_pmd,
>  			      struct vm_area_struct *dst_vma,
> -			      unsigned long dst_addr)
> +			      unsigned long dst_addr,
> +			      uffd_flags_t uffd_flags)
>  {
>  	pte_t _dst_pte, *dst_pte;
>  	spinlock_t *ptl;
> @@ -495,7 +502,7 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
>  					       uffd_flags);
>  		else
>  			err = mfill_zeropage_pte(dst_mm, dst_pmd,
> -						 dst_vma, dst_addr);
> +						 dst_vma, dst_addr, uffd_flags);
>  	} else {
>  		err = shmem_mfill_atomic_pte(dst_mm, dst_pmd, dst_vma,
>  					     dst_addr, src_addr,
> -- 
> 2.25.1
> 

-- 
Peter Xu




