Re: [PATCH v1 2/5] userfaultfd: introduce access-likely mode for common operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 23, 2022 at 11:35:00PM +0000, Nadav Amit wrote:
> On Jun 23, 2022, at 4:24 PM, Peter Xu <peterx@xxxxxxxxxx> wrote:
> 
> > ⚠ External Email
> > 
> > On Wed, Jun 22, 2022 at 11:50:35AM -0700, Nadav Amit wrote:
> >> From: Nadav Amit <namit@xxxxxxxxxx>
> >> 
> >> Using a PTE on x86 with cleared access-bit (aka young-bit)
> >> takes ~600 cycles more than when the access bit is set. At the same
> >> time, setting the access-bit for memory that is not used (e.g.,
> >> prefetched) can introduce greater overheads, as the prefetched memory is
> >> reclaimed later than it should be.
> >> 
> >> Userfaultfd currently does not set the access-bit (excluding the
> >> huge-pages case). Arguably, it is best to let the user control whether
> >> the access bit should be set or not. The expected use is to request
> >> userfaultfd to set the access-bit when the copy/wp operation is done to
> >> resolve a page-fault, and not to set the access-bit when the memory is
> >> prefetched.
> >> 
> >> Introduce UFFDIO_[op]_ACCESS_LIKELY to enable userspace to request the
> >> young bit to be set.
> >> 
> >> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> >> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> >> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> >> Cc: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>
> >> Cc: Peter Xu <peterx@xxxxxxxxxx>
> >> Cc: David Hildenbrand <david@xxxxxxxxxx>
> >> Cc: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> >> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
> > 
> > Hmm.. is the hugetlb code overlooked (for both of the hints), or maybe I
> > missed it? Do we need to cover them too? Thanks,
> 
> I do not know what the value of not setting the PTE’s access/dirty when it
> comes to performance. The pages won’t be swapped out, just as you wrote in
> your comment in hugetlb_mcopy_atomic_pte():
> 
>         /*
>          * Always mark UFFDIO_COPY page dirty; note that this may not be
>          * extremely important for hugetlbfs for now since swapping is not
>          * supported, but we should still be clear in that this page cannot be
>          * thrown away at will, even if write bit not set.
>          */
>         _dst_pte = huge_pte_mkdirty(_dst_pte);
>         _dst_pte = pte_mkyoung(_dst_pte);
> 
> If you want for consistency/robustness not to set dirty on read-only
> entries, that’s something that I can do.

We have more than one options here I guess.

We could have the hints not available for hugetlbfs, but then we'll both
need to add special document for hugetlbfs (when you write the man-page,
let's say) and also it'll be probably better to fail the ioctls with hints
when applying to hugetlb vmas.

Leaving them silent to hugetlbfs is another option, it just sounds weird to
me without any kind of documentation so I like this one least.

Or we could make them work for hugetlbfs too.  Not saying that swap will
work some day (but it still could?!) it just seems all consistent as you
said.

So I'd prefer the last one, but only if you agree.

-- 
Peter Xu





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux