On Sat, Jul 08, 2023 at 11:52:23AM +0800, Yin, Fengwei wrote:
> > Oh, I agree, there are always going to be circumstances where we realise
> > we've made a bad decision and can't (easily) undo it.  Unless we have a
> > per-page pincount, and I Would Rather Not Do That.  But we should _try_
> > to do that because it's the right model -- that's what I meant by "Tell
> > me why I'm wrong"; what scenarios do we have where a user temporarily
> > mlocks (or mprotects or ...) a range of memory, but wants that memory
> > to be aged in the LRU exactly the same way as the adjacent memory that
> > wasn't mprotected?
> for manpage of mlock():
>        mlock(), mlock2(), and mlockall() lock part or all of the calling process's virtual address space into RAM, preventing that memory
>        from being paged to the swap area.
>
> So my understanding is it's OK to let the memory mlocked to be aged with
> the adjacent memory which is not mlocked. Just make sure they are not
> paged out to swap.

Right, it doesn't break anything; it's just a similar problem to
internal fragmentation.  The pages of the folio which aren't mlocked
will also be locked in RAM and never paged out.

> One question for implementation detail:
>   If the large folio cross VMA boundary can not be split, how do we
>   deal with this case? Retry in syscall till it's split successfully?
>   Or return error (and what ERRORS should we choose) to user space?

I would be tempted to allocate memory & copy to the new mlocked VMA.
The old folio will go on the deferred_list and be split later, or its
valid parts will be written to swap and then it can be freed.
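
To put a rough number on the internal-fragmentation point above, here is
a toy model, not kernel code.  It assumes a 4KiB base page and that a
folio partly covered by an mlocked range stays resident as a whole; the
function name and its arguments are made up for illustration.

```python
PAGE_SIZE = 4096  # assumed base page size in bytes


def extra_pinned_pages(folio_start, folio_pages, lock_start, lock_pages):
    """Count pages of one folio that lie outside the mlocked range but
    stay in RAM anyway because the folio is kept resident as a unit.

    All arguments are page indices or page counts (hypothetical model).
    """
    folio_end = folio_start + folio_pages
    lock_end = lock_start + lock_pages
    # Number of folio pages that fall inside the mlocked range.
    overlap = max(0, min(folio_end, lock_end) - max(folio_start, lock_start))
    if overlap == 0:
        return 0  # mlock does not touch this folio; nothing extra is pinned
    return folio_pages - overlap


if __name__ == "__main__":
    # A 16-page folio at page 0; mlock covers pages 12..19.
    extra = extra_pinned_pages(0, 16, 12, 8)
    print(extra, "pages,", extra * PAGE_SIZE, "bytes held beyond the request")
```

In this model the cost is bounded by the folio size minus one page per
partially covered folio, which is why it behaves like internal
fragmentation rather than a correctness problem.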