On 12/9/20 3:15 PM, Jason Gunthorpe wrote:
> On Wed, Dec 09, 2020 at 11:05:39AM +0000, Joao Martins wrote:
>>> Why is all of this special? Any time we see a PMD/PGD/etc pointing to
>>> PFN we can apply this optimization. How come device has its own
>>> special path to do this??
>>
>> I think the reason is that zone_device struct pages have no
>> relationship to one other. So you anyways need to change individual
>> pages, as opposed to just the head page.
>
> Huh? That can't be, unpin doesn't know the memory type when it unpins
> it, and as your series shows unpin always operates on the compound
> head. Thus pinning must also operate on compound heads
>
In the paragraph above I was referring to the code without this series.
Meaning that today zone_device pages are *not* represented as compound
pages, and so compound_head(page) on a non-compound page just returns the
page itself. Otherwise, try_grab_page() (e.g. when pinning pages) would be
broken.

>> I made it special to avoid breaking other ZONE_DEVICE users (and
>> gating that with PGMAP_COMPOUND). But if there's no concerns with
>> that, I can unilaterally enable it.
>
> I didn't understand what PGMAP_COMPOUND was supposed to be for..
>
The purpose of PGMAP_COMPOUND is to online these pages as compound pages
(so head and tails). Today (without the series) struct pages are not
represented the way they are expressed in the page tables, which is what I
am hoping to fix in this series by initializing these as compound pages of
a given order. I introduced PGMAP_COMPOUND to conservatively let both the
old (non-compound) and new (compound) pages co-exist.

I wasn't sure I could just enable it regardless, as I was worried that I
would be breaking other ZONE_DEVICE/memremap_pages users.

>>> Why do we need to check PGMAP_COMPOUND? Why do we need to get pgmap?
>>> (We already removed that from the hmm version of this, was that wrong?
>>> Is this different?) Dan?
>
> And this is the key question - why do we need to get a pgmap here?
>
> I'm going to assert that a pgmap cannot be destroyed concurrently with
> fast gup running. This is surely true on x86 as the TLB flush that
> must have preceded a pgmap destroy excludes fast gup. Other arches
> must emulate this in their pgmap implementations.
>
> So, why do we need pgmap here? Hoping Dan might know
>
> If we delete the pgmap then the devmap stop being special.
>
I will let Dan chip in.

> CH and I looked at this and deleted it from the hmm side:
>
> commit 068354ade5dd9e2b07d9b0c57055a681db6f4e37
> Author: Jason Gunthorpe <jgg@xxxxxxxx>
> Date:   Fri Mar 27 17:00:13 2020 -0300
>
>     mm/hmm: remove pgmap checking for devmap pages
>
>     The checking boils down to some racy check if the pagemap is still
>     available or not. Instead of checking this, rely entirely on the
>     notifiers, if a pagemap is destroyed then all pages that belong to it must
>     be removed from the tables and the notifiers triggered.
>
>     Link: https://lore.kernel.org/r/20200327200021.29372-2-jgg@xxxxxxxx
>
> Though I am wondering if this whole hmm thing is racy with memory
> unplug. Hmm.

	Joao