Re: [PATCH v1 2/5] userfaultfd: introduce access-likely mode for common operations


On Mon, Jun 27, 2022 at 03:27:49PM +0200, David Hildenbrand wrote:
> > Fundamentally, access bit has more meaningful context (0 means cold, 1
> > means hot), for dirty it's really more a perf thing to me (when clear,
> > it'll take extra cycles to set it when memory write happens to it; being
> > clear _may_ help only for the tlb flush example you mentioned but I'm not
> > fully convinced that's correct).
> > 
> > Maybe with the to be proposed RFC patch for tlb flush we can know whether
> > that should be something we can rely on.  It'll add more dependency on this
> > work which I'm sorry to say.  It's just that IMHO we should think carefully
> > for the write-hint because this is a solid new uABI we're talking about.
> > 
> > The other option is we can introduce the access hint first and think more
> > on the dirty one (we can always add it when proper).  What do you think?
> > Also, David please chime in anytime if I missed the whole point when you
> > proposed the idea.
> 
> Well, an ABI that places pages without any information about *why* we're
> doing that forces us to guess what to do or what not to do, and I think
> the zeropage placement is a prime example of that.
> Personally, I think communicating the intention in forms of hints is
> something that doesn't leak implementation details into an ABI.
> 
> So "no planned access" vs. "read_likely" vs. "write_likely" conceptually
> makes sense to me.
> 
> As I raised previously, *if* we want to let the user affect the dirty
> bit setting (1) is then a pure implementation detail. Or whatever else
> we might want to do.
> 
> But I also want to raise awareness that architectures that don't have a
> hw-set dirty bit have to use page faults to mimic dirty tracking. IIRC,
> s390x is a prime example for that: pte_mkclean() sets the WP bit and
> marks the page dirty from the write fault. So it's even more expensive
> than on other architectures.

That last point seems to support the view that we'd rather have a redundant
dirty bit in the ptes than accidentally miss one, even when both are safe.
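
To make that cost concrete, here is a rough userspace sketch, not kernel
code.  It assumes Linux, models "clean" as a write-protected mapping, and
lets a SIGSEGV handler stand in for the write-fault path that marks the page
dirty.  All names in it (tracked_page, on_write_fault,
faults_for_first_write) are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical userspace model, not kernel code: one tracked page. */
static char *tracked_page;
static size_t tracked_len;
static volatile sig_atomic_t write_faults;

/* Plays the role of the write-fault path: record "dirty", then unprotect. */
static void on_write_fault(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;

    (void)sig;
    (void)ctx;
    if (addr < tracked_page || addr >= tracked_page + tracked_len)
        _exit(1); /* a fault this sketch does not model */
    write_faults++;
    if (mprotect(tracked_page, tracked_len, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
}

/*
 * Map one page, optionally leave it "clean" (write-protected), store one
 * byte, and return how many write faults that single store needed.
 * Returns -1 if the setup fails.
 */
static int faults_for_first_write(int start_dirty)
{
    struct sigaction sa = {0}, old;
    int faults, rc;

    tracked_len = (size_t)sysconf(_SC_PAGESIZE);
    tracked_page = mmap(NULL, tracked_len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (tracked_page == MAP_FAILED)
        return -1;

    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    write_faults = 0;
    /* "Clean" is modeled as write-protected (assumption of this sketch). */
    if (!start_dirty && mprotect(tracked_page, tracked_len, PROT_READ) != 0)
        return -1;

    *(volatile char *)tracked_page = 1; /* the first write to the page */

    faults = (int)write_faults;
    sigaction(SIGSEGV, &old, NULL);
    rc = munmap(tracked_page, tracked_len);
    assert(rc == 0);
    (void)rc;
    return faults;
}
```

In this model faults_for_first_write(1) needs no handler round trip, while
faults_for_first_write(0) takes one extra fault before the store can
complete, which is the overhead a redundant dirty bit would avoid.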

So to me WRITE_LIKELY is still mostly about the dirty bit, apart from the
ZEROPAGE case.  I don't have a strong opinion on how we should name that
flag.  If we want to insist on WRITE_LIKELY but only for ZEROPAGE, I think
that's fine; it's just that if I'm the user app, I'd prefer to be sure the
page is allocated after UFFDIO_ZEROPAGE returns, rather than only providing
a hint and then having the kernel say "we'll do something but nothing is
guaranteed".

I also fully agree we don't want to expose impl details, but my question was
more about whether we want that hint at all as a generic one, and in which
cases it helps outside ZEROPAGE.  For an "it can be accessed" hint, I have
an answer, and it seems to apply to most of the uffd ioctls; an "it can be
written" hint is not so generic.

Thanks,

-- 
Peter Xu




