> On 11/01/2017 12:25 PM, Dan Williams wrote:
[..]
>> It's not persistent memory if it requires a hypercall to make it
>> persistent. Unless memory writes can be made durable purely with cpu
>> instructions it's dangerous for it to be treated as a PMEM range.
>> Consider a guest that tried to map it with device-dax which has no
>> facility to route requests to a special flushing interface.
>>
>
> Can we separate the concept of flush interface from persistent memory?
> Say there are two APIs, one is used to indicate the memory type (i.e.,
> /proc/iomem) and another one indicates the flush interface.
>
> So for existing nvdimm hardwares:
> 1: Persist-memory + CLFLUSH
> 2: Persist-memory + flush-hint-table (I know Intel does not use it)
>
> and for the virtual nvdimm which backended on normal storage:
> Persist-memory + virtual flush interface

I see the flush interface as fundamental to identifying the media
properties. It's not byte-addressable persistent memory if the
application needs to call a sideband interface to manage writes. This
is why we have pushed for something like the MAP_SYNC interface to
make filesystem-dax actually behave in a way that applications can
safely treat it as persistent memory, and this is also the guarantee
that device-dax provides. Changing the flush interface makes it
distinct and unusable for applications that want to manage data
persistence in userspace.

>>>
>>>> In what way is this "more complicated"? It was trivial to add support
>>>> for the "volatile" NFIT range, this will not be any more complicated
>>>> than that.
>>>>
>>>
>>> Introducing memory type is easy indeed, however, a new flush interface
>>> definition is inevitable, i.e, we need a standard way to discover the
>>> MMIOs to communicate with host.
>>
>>
>> Right, the proposed way to do that for x86 platforms is a new SPA
>> Range GUID type in the NFIT.
>>
>
> So this SPA is used for both persistent memory region and flush interface?
> Maybe I missed it in previous mails, could you please detail how to do
> it?

Yes, the GUID will specifically identify this range as "Virtio Shared
Memory" (or whatever name survives after a bikeshed debate). The
libnvdimm core then needs to grow a new region type that mostly
behaves the same as a "pmem" region, but drivers/nvdimm/pmem.c grows a
new flush interface to perform the host communication. Device-dax
would be disallowed from attaching to this region type, or we could
grow a new device-dax type that does not allow the raw device to be
mapped, but allows a filesystem mounted on top to manage the flush
interface.

> BTW, please note hypercall is not acceptable for standard, MMIO/PIO regions
> are. (Oh, yes, it depends on Paolo. :))

MMIO/PIO regions work for me, that's not the part of the proposal I'm
concerned about.
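
To make the region-type idea concrete, here is a rough user-space
sketch in C. It is a model only, not libnvdimm code: every identifier
in it (region_type, REGION_VIRTIO_SHMEM, host_flush_fn, region_flush,
device_dax_may_attach, counting_host_flush) is made up for
illustration, and the host hook merely stands in for whatever MMIO/PIO
mechanism ends up being defined. It shows the two properties described
above: the new type flushes through a host-communication hook instead
of CPU instructions, and raw device-dax attachment is refused for it.

```c
/*
 * User-space model only; NOT the libnvdimm API. All names below are
 * hypothetical and exist purely to illustrate the proposal.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum region_type {
    REGION_PMEM,         /* durable with CPU instructions alone */
    REGION_VIRTIO_SHMEM, /* assumed new type: durable only via the host */
};

/* Hypothetical hook standing in for the MMIO/PIO host communication. */
typedef int (*host_flush_fn)(void *ctx);

struct region {
    enum region_type type;
    host_flush_fn host_flush; /* consulted only for REGION_VIRTIO_SHMEM */
    void *host_ctx;
};

/* Raw device-dax mapping is refused when a sideband flush is required. */
bool device_dax_may_attach(const struct region *r)
{
    assert(r != NULL);
    return r->type != REGION_VIRTIO_SHMEM;
}

/* Make prior writes durable; returns 0 on success, negative on error. */
int region_flush(const struct region *r)
{
    assert(r != NULL);
    switch (r->type) {
    case REGION_PMEM:
        /* Real code would flush CPU caches here; nothing to model. */
        return 0;
    case REGION_VIRTIO_SHMEM:
        if (r->host_flush == NULL)
            return -1; /* no way to reach the host */
        return r->host_flush(r->host_ctx);
    }
    return -1; /* unknown region type */
}

/* Example hook for the model: counts how often the host was asked. */
int counting_host_flush(void *ctx)
{
    int *calls = ctx;
    (*calls)++;
    return 0;
}
```

In this model a filesystem sitting on a REGION_VIRTIO_SHMEM region
would call region_flush() on fsync, while an application mapping a
REGION_PMEM region keeps managing persistence itself.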