On Fri, 5 Apr 2013, Minchan Kim wrote:

> > >> How about add a knob?
> > >
> > >Maybe, volunteering?
> >
> > Hi Minchan,
> >
> > I can be the volunteer, what I care is if add a knob make sense?
>
> Frankly speaking, I'd like to avoid new knob but there might be
> some workloads suffered from mlocked page migration so we couldn't
> dismiss it. In such case, introducing the knob would be a solution
> with default enabling. If we don't have any report for a long time,
> we can remove the knob someday, IMHO.

No knob please. A new implementation for page pinning that avoids the
mlock crap.

1. It should be available for device drivers to pin their memory (they are
   now elevating the ref counter, which means page migration will have to
   see if it can account for all references before giving up, and it does
   that quite frequently). So there needs to be an in-kernel API, a syscall
   API as well as a command line one. Preferably as similar as possible.

2. A sane API for marking pages as mlocked. Maybe part of mmap? I hate the
   command line tools and the APIs for doing that right now.

3. The reservation scheme for mlock via ulimit is broken. We have
   per-process constraints only, it seems. If you start enough processes
   you can still make the kernel go OOM.

4. mlock semantics are prescribed by POSIX, which states that the page
   stays in memory. I think we should stay with that narrow definition for
   mlock.

5. Pinning could also mean that page faults on the page are to be avoided.
   COW could occur on fork, and page table entries could be instantiated at
   mmap/fork time. Pinning could mean that minor/major faults will not
   occur on a page.
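For reference, a minimal userspace sketch of the existing interface that points 3 and 4 talk about. This is not from the thread and is not a proposal for the new pinning API. The helper names memlock_limit() and lock_one_page() are made up for illustration; getrlimit(), RLIMIT_MEMLOCK, mlock() and munlock() are the existing calls. It assumes Linux with glibc.

```c
#define _DEFAULT_SOURCE         /* assumption: glibc feature-test macro */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: fetch the soft RLIMIT_MEMLOCK limit of the
 * calling process (the value shells expose through "ulimit -l").
 * Returns 0 on success, -1 if getrlimit() fails.
 */
static int memlock_limit(rlim_t *soft)
{
	struct rlimit rl;

	assert(soft != NULL);
	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	*soft = rl.rlim_cur;
	return 0;
}

/*
 * Hypothetical helper: allocate one page, touch it, and try to
 * mlock() it. Returns 0 if the page could be locked, -1 otherwise
 * (for example when the limit above is too small). The page is
 * unlocked and freed again before returning.
 */
static int lock_one_page(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	void *buf = NULL;
	int ret;

	if (psz <= 0)
		return -1;
	if (posix_memalign(&buf, (size_t)psz, (size_t)psz) != 0)
		return -1;

	memset(buf, 0, (size_t)psz);	/* make sure the page is populated */
	ret = mlock(buf, (size_t)psz);	/* POSIX: page stays resident */
	if (ret == 0)
		munlock(buf, (size_t)psz);
	free(buf);
	return ret;
}
```

Both helpers only look at the calling process. That is the limitation point 3 describes: the limit is checked per process, so the sketch has no way to see how much memory other processes have already locked.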