On Mon, Oct 12, 2020 at 4:00 AM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
[..]
> On 10/10/20 9:15 AM, yulei zhang wrote:
> > On Fri, Oct 9, 2020 at 7:53 PM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
> >> On 10/9/20 12:39 PM, yulei zhang wrote:
> >>> Joao, thanks a lot for the feedback. One more thing needs to mention
> >>> is that dmemfs also support fine-grained
> >>> memory management which makes it more flexible for tenants with
> >>> different requirements.
> >>>
> >> So as DAX when it allows to partition a region (starting 5.10). Meaning you have a region
> >> which you dedicated to userspace. That region can then be partitioning into devices which
> >> give you access to multiple (possibly discontinuous) extents with at a given page
> >> granularity (selectable when you create the device), accessed through mmap().
> >> You can then give that device to a cgroup. Or you can return that memory back to the
> >> kernel (should you run into OOM situation), or you recreate the same mappings across
> >> reboot/kexec.
> >>
> >> I probably need to read your patches again, but can you extend on the 'dmemfs also support
> >> fine-grained memory management' to understand what is the gap that you mention?
> >>
> > sure, dmemfs uses bitmap to track the memory usage in the reserved
> > memory region in
> > a given page size granularity. And for each user the memory can be
> > discrete as well.
> >
> That same functionality of tracking reserved region usage across different users at any
> page granularity is covered the DAX series I mentioned below. The discrete part -- IIUC
> what you meant -- is then reduced using DAX ABI/tools to create a device file vs a filesystem.

Put another way. Linux already has a fine grained memory management system, the page allocator. Now, with recent device-dax extensions, it also has a coarse grained memory management system for physical address-space partitioning and a path for struct-page-less backing for VMs.
What feature gaps remain vs dmemfs, and can those gaps be closed with incremental improvements to the 2 existing memory-management systems?