[...]

>> I'm bad at naming, UFFDIO_COPY_MODE_ACCESS_LIKELY would express what I
>> have in mind.
>
> How about UFFDIO_COPY_MODE_WILLNEED_READ ?

Would work for me.

>
>>
>>> Introduce UFFDIO_COPY_MODE_YOUNG to enable userspace to request the
>>> young bit to be set. For UFFDIO_CONTINUE and UFFDIO_ZEROPAGE set the bit
>>> unconditionally since the former is only used to resolve page-faults and
>>> the latter would not benefit from not setting the access-bit.
>>>
>>> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>>> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>> Cc: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>
>>> Cc: Peter Xu <peterx@xxxxxxxxxx>
>>> Cc: David Hildenbrand <david@xxxxxxxxxx>
>>> Cc: Mike Rapoport <rppt@xxxxxxxxxxxxx>
>>> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
>>>
>>> ---
>>>
>>> There are 2 possible enhancements:
>>>
>>> 1. Use the flag to decide on whether to mark the PTE as dirty (for
>>>    writable PTEs). I guess that setting the dirty-bit is as expensive as
>>>    setting the access-bit, and setting it introduces similar tradeoffs,
>>>    as mentioned above.
>>>
>>> 2. Introduce a similar mode for write-protect and use this information
>>>    for setting both the young and dirty bits. Makes one wonder whether
>>>    mprotect() should also set the bit in certain cases...
>>
>> I wonder if UFFDIO_COPY_MODE_READ_ACCESS_LIKELY vs.
>> UFFDIO_COPY_WRITE_ACCESS_LIKELY could even make sense. I feel like it could.
>>
>> For example, QEMU knows if a page fault it's resolving was due to a read
>> or a write fault and could use that information accordingly. Of course,
>> we don't completely know if we currently have a read fault, if we could
>> get a write fault immediately after.
>>
>> Especially in the context of UFFDIO_ZEROPAGE,
>> UFFDIO_ZEROPAGE_WRITE_ACCESS_LIKELY could ... not place the zeropage but
>> instead populate an actual page and mark it accessed+dirty. I even have
>> a use case for that ;)
>>
>>
>> The kernel could decide how to treat these hints -- for example, if it
>> doesn't want user space to mess with access/dirty bits, it could just
>> mostly ignore the hints.
>
> I can do that. I think users can do the zero page-copy themselves today, but
> whatever you prefer.

Just so we're on the same page and I'm not missing some smart way: it
would have to provide a zeroed buffer in user space, and trigger the
copy via UFFDIO_COPY.

Instead, the kernel can simply clear the user page directly when
allocating from the buddy, instead of eventually zeroing it and then
copying from a zeroed user buffer / zeropage.

>
> But, I cannot take it anymore: the list of arguments for uffd stuff is
> crazy. I would like to collect all the possible arguments that are used for
> uffd operation into some “struct uffd_op”.
>
> Any objection?

Not from my side, as long as it doesn't break uapi.

--
Thanks,

David / dhildenb
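
For reference, the "zeroed buffer in user space + UFFDIO_COPY" workaround mentioned above could look roughly like the sketch below. It only uses the existing uapi (struct uffdio_copy and the UFFDIO_COPY ioctl from <linux/userfaultfd.h>). The helper names prepare_zero_copy() and resolve_with_zeroed_copy() are made up for illustration, and mode is left at 0 because the hint flags discussed in this thread are still proposals.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical helper: fill a UFFDIO_COPY request that installs the
 * contents of a caller-provided, already zeroed buffer at dst.
 * dst and len are expected to be page aligned by the caller.
 */
void prepare_zero_copy(struct uffdio_copy *req, void *dst,
                       const void *zero_buf, size_t len)
{
    assert(req != NULL && zero_buf != NULL);

    memset(req, 0, sizeof(*req));
    req->dst = (uint64_t)(uintptr_t)dst;      /* faulting range in the registered area */
    req->src = (uint64_t)(uintptr_t)zero_buf; /* zeroed user-space buffer */
    req->len = len;
    req->mode = 0;                            /* no flags; hint modes are only proposed */
}

/*
 * Hypothetical helper: resolve a missing page with a real, zero-filled
 * page instead of the shared zeropage. The kernel allocates a page and
 * then copies zeroes from user space into it, which is the extra copy
 * that clearing the page directly in the kernel would avoid.
 * Returns the ioctl() result (0 on success, -1 with errno set on failure).
 */
int resolve_with_zeroed_copy(int uffd, void *dst,
                             const void *zero_buf, size_t len)
{
    struct uffdio_copy req;

    prepare_zero_copy(&req, dst, zero_buf, len);
    return ioctl(uffd, UFFDIO_COPY, &req);
}
```

A fault handler would call resolve_with_zeroed_copy(uffd, fault_addr_aligned, zero_page, page_size) with a static, page-sized zero buffer. On a partial copy the kernel reports progress in req.copy, which this sketch does not inspect.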