Hugh,

On Fri, May 14, 2021 at 12:07:38AM -0700, Hugh Dickins wrote:
> On Wed, 12 May 2021, Peter Xu wrote:
> > On Tue, Apr 27, 2021 at 12:12:53PM -0400, Peter Xu wrote:
> > > This is v2 of uffd-wp shmem & hugetlbfs support, which completes uffd-wp as a
> > > kernel full feature, as it only supports anonymous before this series.
> >
> > Ping..
> >
> > Thinking about a repost, as this series shouldn't be able to apply after we've
> > got more relevant patches into -mm.  E.g., the full minor fault, and also some
> > small stuff like pagemap, as we need one more patch to support shmem/hugetlbfs
> > too.
> >
> > Hugh, haven't received any further comment from you on shmem side (or on the
> > general idea).  It would be great to still have some of your input.
> >
> > Let me know if you prefer to read a fresh new version otherwise.
>
> I am very sorry to let you down, Peter, repeatedly; but it is now very
> clear that I shall *never* have time to review your patchset - I am too
> slow, have too much else to attend to, and take too long each time to
> sink myself deep enough into userfaultfd.

Never mind!  It's just that I'm kind of obliged to ask for your opinion, as you
contributed part of the idea and you are also the shmem maintainer. :)  So
that's what I did before I started to bother Andrew (since I know Andrew is 100%
busy.. that's also why I try my best not to send Andrew review pings for any of
my work; Andrew can chime in anytime anyway, as he is in the loop).

>
> I realize that you're being considerate, and expecting no more than
> a few comments from me, rather than asking for formal review; but it's
> still too much for me to get into.

I'm actually even prepared to receive a full-series NACK anytime. :)  To me it's
more important to have the right direction first; I didn't receive any objection
during the RFC, so I moved on, assuming no one thinks it wrong.
However, it's indeed true that you never let me down (as far as I can see from
the other discussions): you do very in-depth reviews and hunt down every single
potential risk you notice, even in a rare error path - that's just too
attractive a reviewer for all the patch writers!

>
> The only reason I was involved at all, was when you were wondering how
> to handle the pagetable entries for shmem.  I suggested one encoding,
> Andrea suggested slightly differently: Andrea's was more elegant (no
> "swap type" required), and it looked like you went with his - good.
>
> I wonder whether you noticed
> https://lore.kernel.org/linux-mm/20210407084238.20443-2-apopple@xxxxxxxxxx/
> which might interfere.  I've had no more time to look at that than yours,
> so no opinion on it (and I don't know what happened to it after that).

Thanks for the pointer.  It looks like there'll be some slight rebase work, but
the ideas are totally orthogonal; we'll see who will do the rebase (yeah,
probably me :).

Hmm, meanwhile, if that were an initial version I might go and suggest renaming
pfn_swap_entry_to_page() to start with pte_swp_*(), as it operates on a swp pte,
not a pfn.  However, it's probably too late for a v8 series, so I'll give up.

It also mentions something like "special swap pte"; I hope that won't get
confused with what this series is proposing.  We'll see when it becomes a
problem; so far it still seems okay.

>
> Please keep uppermost in mind when modifying mm/shmem.c for userfaultfd,
> the difference between shared and private; and be on guard against the
> ways in which CONFIG_USERFAULTFD=y might open a door to abuse.

Will do.  Then I'll move this series on.

Re shared/private, let me mention one thing, just in case it helps for peace of
mind: the most dangerous place for uffd-wp+shmem should be the
UFFDIO_WRITEPROTECT page-resolving ioctl, where we want to re-grant the write
bit to ptes if needed (for minor mode, the danger point is UFFDIO_CONTINUE
instead).  However, it should be even safer than UFFDIO_CONTINUE, as
UFFDIO_WRITEPROTECT never grants the write bit for real but leaves that all to
the page fault handler (in change_pte_range()):

		} else if (uffd_wp_resolve) {
			/*
			 * Leave the write bit to be handled
			 * by PF interrupt handler, then
			 * things like COW could be properly
			 * handled.
			 */
			ptent = pte_clear_uffd_wp(ptent);
		}

Meanwhile, the newprot will never have the write bit either, afaik, in
mwriteprotect_range():

		newprot = vm_get_page_prot(dst_vma->vm_flags);

The last risk is the dirty_accountable trick in change_pte_range(), but as you
analyzed in the other thread, userfaultfd never uses MM_CP_DIRTY_ACCT, so it
should be safe too.

Thanks,

--
Peter Xu
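
For illustration only, here is a minimal user-space sketch of the argument about the resolve path: clearing the uffd-wp marker does not itself set the write bit, and write permission is granted only later, on a write fault. The toy_* names and the bit layout are hypothetical and exist only for this sketch; real PTE layouts are per-architecture, and the real fault handler does much more (COW, dirty tracking and so on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for a toy PTE (not any real architecture). */
#define TOY_PTE_WRITE   (1u << 0)
#define TOY_PTE_UFFD_WP (1u << 1)

typedef uint32_t toy_pte_t;

/*
 * Toy stand-in for the uffd_wp_resolve branch: drop the uffd-wp
 * marker and leave the write bit exactly as it was.
 */
static inline toy_pte_t toy_resolve_uffd_wp(toy_pte_t pte)
{
	return pte & ~TOY_PTE_UFFD_WP;
}

/*
 * Toy stand-in for a write fault.  While the uffd-wp marker is set,
 * the fault is not resolved here (in the kernel it would be reported
 * to the userfaultfd monitor).  Otherwise the write bit is granted,
 * standing in for the fault handler's own decision.
 */
static inline bool toy_write_fault(toy_pte_t *pte)
{
	if (*pte & TOY_PTE_UFFD_WP)
		return false;
	*pte |= TOY_PTE_WRITE;
	return true;
}
```

Starting from a pte with only TOY_PTE_UFFD_WP set, toy_resolve_uffd_wp() yields a pte with neither bit set; only a subsequent toy_write_fault() sets TOY_PTE_WRITE.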