On Fri, Jan 13, 2017 at 09:49:14PM +0000, Michaud, Adrian wrote:
> I'd like to attend and propose one or all of the following topics at this year's summit.
>
> Multiple Page Caches (Software Enhancements)
> --------------------------
> Support for multiple page caches can provide many benefits to the kernel.
> Different memory types can be put into different page caches. One page
> cache for native DDR system memory, another page cache for slower
> NV-DIMMs, etc.
> General memory can be partitioned into several page caches of different
> sizes and could also be dedicated to high priority processes or used
> with containers to better isolate memory by dedicating a page cache to a
> cgroup process.
> Each VMA, or process, could have a page cache identifier, or page
> alloc/free callbacks that allow individual VMAs or processes to specify
> which page cache they want to use.
> Some VMAs might want anonymous memory backed by vast amounts of slower
> server class memory like NV-DIMMs.
> Some processes or individual VMAs might want their own private page
> cache.
> Each page cache can have its own eviction policy and low-water marks.
> Individual page caches could also have their own swap device.

Sounds like you're re-inventing NUMA. What am I missing?

> Memory Tiering (Software Enhancements)
> --------------------
> Using multiple page caches, evictions from one page cache could be moved
> and remapped to another page cache instead of unmapped and written to
> swap.
> If a system has 16GB of high speed DDR memory, and 64GB of slower
> memory, one could create a page cache with high speed DDR memory,
> another page cache with slower 64GB memory, and evict/copy/remap from
> the DDR page cache to the slow memory page cache. Evictions from the
> slow memory page cache would then get unmapped and written to swap.

I guess it's something that can be done as part of NUMA balancing.
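For reference: if the slow memory shows up as its own NUMA node, the
existing machinery already lets you move pages between the tiers
explicitly from userspace, and automatic NUMA balancing does the
equivalent migration on its own based on hinting faults. Below is a
rough sketch, not a proposal; node id 1 standing in for the "slow
memory" node is just an assumption for illustration. Build with -lnuma.

#include <numaif.h>   /* move_pages(), MPOL_MF_MOVE; link with -lnuma */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NPAGES 4

int main(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	void *buf;
	void *pages[NPAGES];
	int nodes[NPAGES];
	int status[NPAGES];
	int i;

	/* Allocate a few pages and touch them so they are actually mapped. */
	if (posix_memalign(&buf, page_size, NPAGES * page_size))
		return 1;
	memset(buf, 0, NPAGES * page_size);

	for (i = 0; i < NPAGES; i++) {
		pages[i] = (char *)buf + i * page_size;
		nodes[i] = 1;	/* assumed id of the slower-memory node */
	}

	/* Ask the kernel to migrate the pages to the target node. */
	if (move_pages(0, NPAGES, pages, nodes, status, MPOL_MF_MOVE) < 0) {
		perror("move_pages");
		return 1;
	}

	/* status[] reports the node each page ended up on (or -errno). */
	for (i = 0; i < NPAGES; i++)
		printf("page %d: node/status %d\n", i, status[i]);

	return 0;
}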
> Better LRU evictions (Software and Hardware Enhancements)
> -------------------------
> Add a page fault counter to the page struct to help colorize page demand.
> We could suggest to Intel/AMD and other architecture leaders that TLB
> entries also have a translation counter (8-10 bits is sufficient)
> instead of just an "accessed" bit. Scanning/clearing access bits is
> obviously inefficient; however, if TLBs had a translation counter
> instead of a single accessed bit then scanning and recording the amount
> of activity each TLB entry has would be significantly better and allow
> us to better calculate LRU pages for evictions.

Except that would make memory accesses slower. Even access bit handling
is a noticeable performance hit: the processor has to write into the page
table entry on the first access to the page. What you're proposing would
make the first 2^8-2^10 accesses slower. Sounds like a no-go to me.

> TLB Shootdown (Hardware Enhancements)
> --------------------------
> We should stomp our feet and demand that TLB shootdowns be
> hardware assisted in future architectures. Current TLB shootdown on x86
> is horribly inefficient and obviously doesn't scale. The QPI/UPI local
> bus protocol should provide TLB range invalidation broadcast so that a
> single CPU can concurrently notify other CPUs/cores (with a selection
> mask) that a shared TLB entry has changed. Sending an IPI to each core
> is horribly inefficient, especially with core counts increasing and
> the frequency of TLB unmapping/remapping also possibly increasing
> shortly with new server class memory extension technology.

IIUC, the best you can get from hardware is an IPI behind the scenes.
I doubt it's worth the effort.

-- 
 Kirill A. Shutemov