On 4/29/21 9:25 PM, Stefan Hajnoczi wrote:
On Wed, Apr 28, 2021 at 11:48:21PM -0400, Shivaprasad G Bhat wrote:
The nvdimm devices are expected to ensure write persistence in scenarios
such as power failure.
libpmem uses architecture-specific instructions, such as dcbf on POWER,
to flush cached data to the backend nvdimm device during normal writes,
followed by explicit flushes if the backend devices are not synchronous
DAX capable.
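For context, the dcbf-based flush amounts to roughly the following
(an illustrative sketch only; the exact barrier sequence used by libpmem
and the kernel may differ):

/* Illustrative: write one dirty cache block back to the nvdimm and
 * order the flush against subsequent stores. */
static inline void flush_pmem_line(const void *addr)
{
    asm volatile("dcbf 0,%0" : : "r"(addr) : "memory");
    asm volatile("sync" : : : "memory");
}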
Qemu virtual nvdimm devices are memory mapped. The dcbf in the guest
and the subsequent flush do not translate to an actual flush of the backend
file on the host in the case of file-backed v-nvdimms. On x86_64 this is
addressed by virtio-pmem, where the explicit flushes translate to an
fsync in Qemu.
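(For reference, the host-side persistence step in that model boils down
to an fsync of the backing file. A minimal sketch, with a hypothetical
helper name:)

#include <errno.h>
#include <unistd.h>

/* Sketch: persist a file-backed virtual nvdimm on the host. The guest's
 * explicit flush request ultimately becomes an fsync on the backing
 * file descriptor. */
static int backend_flush(int backing_fd)
{
    if (fsync(backing_fd) < 0)
        return -errno;
    return 0;
}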
On SPAPR, the issue is addressed by adding a new hcall through which the
guest ndctl driver requests an explicit flush when the backend nvdimm
cannot ensure write persistence with dcbf alone. So, the approach here is
to convey through a device tree property when the hcall flush is required.
The guest makes the hcall when the property is found, instead
of relying on dcbf.
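A minimal sketch of that guest-side policy (the property string and
function names are assumptions based on this series, not final upstream
code):

/* Decide at flush time whether to use the H_SCM_FLUSH hcall or the plain
 * dcbf-based barrier. hcall_flush_required would be set at probe time from
 * a device tree property such as "ibm,hcall-flush-required". */
static void vpmem_flush(struct papr_scm_priv *p)
{
    if (p->hcall_flush_required)
        papr_scm_flush_hcall(p);      /* hypothetical H_SCM_FLUSH wrapper */
    else
        arch_pmem_flush_barrier();    /* dcbf + sync path */
}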
Sorry, I'm not very familiar with SPAPR. Why add a hypercall when the
virtio-nvdimm device already exists?
On virtualized ppc64 platforms, guests use the papr_scm.ko kernel driver
for persistent memory support. This was done so that we can use one kernel
driver to support persistent memory with multiple hypervisors. To avoid
supporting multiple drivers in the guest, the -device nvdimm Qemu
command line results in Qemu using the PAPR SCM backend. What this patch
series does is make sure we expose the correct synchronous fault
support when we back such an nvdimm device with a file.
The existing PAPR SCM backend enables persistent memory support with the
help of multiple hypercalls:
#define H_SCM_READ_METADATA 0x3E4
#define H_SCM_WRITE_METADATA 0x3E8
#define H_SCM_BIND_MEM 0x3EC
#define H_SCM_UNBIND_MEM 0x3F0
#define H_SCM_UNBIND_ALL 0x3FC
Most of them are already implemented in Qemu. This patch series
implements the H_SCM_FLUSH hypercall.
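A rough sketch of what the Qemu-side handler for that hcall could look
like (hedged: the backend fd lookup shown here is illustrative, and the
real handler in the series performs the flush asynchronously with continue
tokens rather than a synchronous fsync):

static target_ulong h_scm_flush(PowerPCCPU *cpu, SpaprMachineState *spapr,
                                target_ulong opcode, target_ulong *args)
{
    uint32_t drc_index = args[0];
    SpaprDrc *drc = spapr_drc_by_index(drc_index);
    PCDIMMDevice *dimm;
    int backend_fd;

    if (!drc || !drc->dev ||
        spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PMEM) {
        return H_PARAMETER;
    }

    /* resolve the nvdimm's memory backend and persist its backing file */
    dimm = PC_DIMM(drc->dev);
    backend_fd = memory_region_get_fd(&dimm->hostmem->mr);
    if (fsync(backend_fd) < 0) {
        return H_HARDWARE;
    }

    return H_SUCCESS;
}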
-aneesh