On 07/07/2023 20:26, Matthew Wilcox wrote:
> On Fri, Jul 07, 2023 at 09:15:02PM +0200, David Hildenbrand wrote:
>>>> Sure, any time we PTE-map a THP we might just say "let's put that on the
>>>> deferred split queue" and cross fingers that we can eventually split it
>>>> later. (I was recently thinking about that in the context of the mapcount
>>>> ...)
>>>>
>>>> It's all a big mess ...
>>>
>>> Oh, I agree, there are always going to be circumstances where we realise
>>> we've made a bad decision and can't (easily) undo it. Unless we have a
>>> per-page pincount, and I Would Rather Not Do That.
>>
>> I agree ...
>>
>> But we should _try_
>>> to do that because it's the right model -- that's what I meant by "Tell
>>
>> Try to have per-page pincounts? :/ or do you mean, try to split on VMA
>> split? I hope the latter (although I'm not sure about performance) :)
>
> Sorry, try to split a folio on VMA split.
>
>>> me why I'm wrong"; what scenarios do we have where a user temporarilly
>>> mlocks (or mprotects or ...) a range of memory, but wants that memory
>>> to be aged in the LRU exactly the same way as the adjacent memory that
>>> wasn't mprotected?
>>
>> Let me throw in a "fun one".
>>
>> Parent process has a 2 MiB range populated by a THP. fork() a child process.
>> Child process mprotects half the VMA.
>>
>> Should we split the (COW-shared) THP? Or should we COW/unshare in the child
>> process (ugh!) during the VMA split.
>>
>> It all makes my brain hurt.
>
> OK, so this goes back to what I wrote earlier about attempting to choose
> what size of folio to allocate on COW:
>
> https://lore.kernel.org/linux-mm/Y%2FU8bQd15aUO97vS@xxxxxxxxxxxxxxxxxxxx/
>
> : the parent had already established
> : an appropriate size folio to use for this VMA before calling fork().
> : Whether it is the parent or the child causing the COW, it should probably
> : inherit that choice and we should default to the same size folio that
> : was already found.

FWIW, I had patches in my original RFC that aimed to follow this policy for
large anon folios [1] & [2], and intend to follow up with a modified version
of these patches once we have an initial submission.

[1] https://lore.kernel.org/linux-mm/20230414130303.2345383-11-ryan.roberts@xxxxxxx/
[2] https://lore.kernel.org/linux-mm/20230414130303.2345383-15-ryan.roberts@xxxxxxx/
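
To make the "inherit the size on COW" policy concrete, here is a rough
userspace sketch, not code from the patches in [1] or [2]. The helper name
cow_folio_order(), the 4 KiB base page size, and the rule of shrinking the
order until the naturally aligned block fits inside the VMA are all
assumptions made for illustration. The idea is only that the COW allocation
starts from the order of the folio being copied and falls back to smaller
orders when a VMA split (e.g. the child's mprotect of half the range) means
the original size no longer fits.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u /* assumption: 4 KiB base pages */

/*
 * Hypothetical helper, not a kernel function: choose the order of the
 * folio to allocate on a COW fault. Start from the order of the folio
 * being copied (src_order) and shrink until the naturally aligned block
 * containing addr lies entirely inside [vma_start, vma_end).
 */
static unsigned int cow_folio_order(uintptr_t addr, uintptr_t vma_start,
                                    uintptr_t vma_end, unsigned int src_order)
{
    assert(vma_start <= addr && addr < vma_end);

    for (unsigned int order = src_order; order > 0; order--) {
        uintptr_t size = (uintptr_t)1 << (PAGE_SHIFT + order);
        uintptr_t start = addr & ~(size - 1); /* natural alignment */

        /* start <= addr < vma_end, so the subtraction cannot wrap. */
        if (start >= vma_start && size <= vma_end - start)
            return order;
    }
    return 0; /* fall back to a single base page */
}
```

With a 2 MiB VMA at 0x200000 and a source folio of order 9, a fault anywhere
in the range keeps order 9. Once the VMA is split at 0x300000, the same fault
drops to order 8, which is one way to read "default to the same size folio"
while still respecting the new VMA boundary.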