Re: [RFC PATCH 0/4] Allow persistent data on DAX device being used as KMEM

On 02.08.22 19:57, Srinivas Aji wrote:
> Linux supports adding a DAX driver managed memory region as system
> memory using the KMEM driver (from version 5.1). We would like to use
> a persistent addressable memory segment as system memory and
> simultaneously for storing some persistent data.
> 
> Motivation: It is already possible to partition an NVDIMM device for
> RAM and storage by creating separate regions on the device and using
> one of them with KMEM and another as fsdax. This patch set is a start
> to trying to get zero copy snapshots of processes which are using the
> DAX device as RAM. That requires dynamically sharing pages between
> process RAM and the storage within a single NVDIMM region.
> 
> To do this, we add a layer for handling the persistent data which does
> the following:
> 
> 1. When a DAX device is added as KMEM, mark all the memory as
>    allocated and pass it up to a module which is aware of the storage
>    layout.
> 
> 2. This module scans the memory, identifies the unused parts, and
>    frees those memory pages.
> 
> 3. Further memory from this device is allocated using the kernel
>    memory allocation API. The memory allocation API currently allows
>    the allocation to be limited only based on NUMA node. So this
>    feature works only when the DAX device used as KMEM is the only
>    memory from its NUMA node.
> 
> 4. Discarding of blocks previously used for persistent data results in
>    those blocks being freed to system memory.
> 
> As an example, we implement a simple persistence module using the
> above framework to provide a block device. A block device assumes all
> blocks are always available, but in this case we have to get the
> blocks through the memory allocation API, at an offset not under our
> control. To provide block device semantics, we maintain an array which
> maps the logical block number to the real physical page, if one
> exists. Block device Trim/Discard support is used to mark blocks as
> unused.
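
To make sure I read the mapping idea correctly, here is a rough userspace
model of it, not the patch code. All names (blkmap, blk_write, blk_read,
blk_discard, BLK_SIZE, NBLOCKS) are made up for illustration, malloc()/free()
stand in for the kernel page allocator, and "unmapped blocks read as zeroes"
is my assumption rather than something stated above.

```c
/* Userspace model only: malloc()/free() stand in for the page allocator. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLK_SIZE 4096   /* hypothetical block size, one page */
#define NBLOCKS  16     /* hypothetical device size in blocks */

/* blkmap[lbn] is the backing page for logical block lbn, NULL if unmapped. */
static void *blkmap[NBLOCKS];

/* Write one full block, allocating a backing page on first use. 0 on success. */
static int blk_write(size_t lbn, const void *src)
{
    assert(lbn < NBLOCKS);
    if (!blkmap[lbn]) {
        blkmap[lbn] = malloc(BLK_SIZE);
        if (!blkmap[lbn])
            return -1;
    }
    memcpy(blkmap[lbn], src, BLK_SIZE);
    return 0;
}

/* Read one full block; unmapped blocks read back as zeroes (assumption). */
static void blk_read(size_t lbn, void *dst)
{
    assert(lbn < NBLOCKS);
    if (blkmap[lbn])
        memcpy(dst, blkmap[lbn], BLK_SIZE);
    else
        memset(dst, 0, BLK_SIZE);
}

/* Discard: drop the mapping and hand the page back to the allocator. */
static void blk_discard(size_t lbn)
{
    assert(lbn < NBLOCKS);
    free(blkmap[lbn]);
    blkmap[lbn] = NULL;
}
```

The point of the indirection is that the device never owns a fixed range:
a block costs memory only while it is mapped, and a discard is the only way
a page returns to the system.
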
> 
> While we have the block device here as an example, a memory filesystem
> might be a more useful implementation. I am not sure if any of the
> existing in-memory filesystem structures are suited for
> persistence. Any suggestions for this are appreciated.
> 
> Srinivas Aji (4):
>   mm/memory_hotplug: Add MHP_ALLOCATE flag which treats hotplugged
>     memory as allocated

Without seeing the actual patches, I am very skeptical that this is the
right approach, especially regarding memory onlining/offlining.

virtio-mem achieves something similar (yet different) by hooking into
generic_online_page(). From there, you can control what should actually
happen with memory that is getting onlined (e.g., free it to the buddy
or treat it differently).

Did you evaluate going that path?
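
To illustrate what I mean, here is a small userspace model of a per-page
online hook, not kernel code. Every name in it (model_pages,
model_page_in_use, model_persist_online, model_online_range, ...) is
hypothetical, and the "pages 2 and 5 hold persistent data" layout is invented
for the example. In the kernel, as far as I recall, a driver registers such a
hook with set_online_page_callback() and calls generic_online_page() for the
pages it wants released to the buddy.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NPAGES 8

/* Per-page state in the model. */
enum model_state { MODEL_OFFLINE, MODEL_FREE, MODEL_RESERVED };

static enum model_state model_pages[MODEL_NPAGES];

/* Hypothetical layout knowledge: pages 2 and 5 hold persistent data. */
static bool model_page_in_use(size_t pfn)
{
    return pfn == 2 || pfn == 5;
}

/* Stand-in for the default behaviour: hand the page to the allocator. */
static void model_default_online(size_t pfn)
{
    model_pages[pfn] = MODEL_FREE;
}

/* Stand-in for a driver hook that filters pages while they are onlined. */
static void model_persist_online(size_t pfn)
{
    if (model_page_in_use(pfn))
        model_pages[pfn] = MODEL_RESERVED;  /* keep it out of the allocator */
    else
        model_default_online(pfn);
}

typedef void (*model_online_cb)(size_t pfn);

/* Online a range, calling cb once per page; returns the pages made free. */
static size_t model_online_range(size_t start, size_t n, model_online_cb cb)
{
    size_t freed = 0;

    assert(start + n <= MODEL_NPAGES);
    for (size_t pfn = start; pfn < start + n; pfn++) {
        cb(pfn);
        if (model_pages[pfn] == MODEL_FREE)
            freed++;
    }
    return freed;
}
```

With a hook like this the used pages never reach the allocator in the first
place, so there is no need to mark everything allocated and free the unused
parts afterwards.
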

-- 
Thanks,

David / dhildenb




