"Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxx> writes: > Vaibhav Jain <vaibhav@xxxxxxxxxxxxx> writes: > >> Hi Shiva, >> >> Apologies for a late review but something just caught my eye while >> working on a different patch. >> >> Shivaprasad G Bhat <sbhat@xxxxxxxxxxxxx> writes: >> >>> Add support for ND_REGION_ASYNC capability if the device tree >>> indicates 'ibm,hcall-flush-required' property in the NVDIMM node. >>> Flush is done by issuing H_SCM_FLUSH hcall to the hypervisor. >>> >>> If the flush request failed, the hypervisor is expected to >>> to reflect the problem in the subsequent nvdimm H_SCM_HEALTH call. >>> >>> This patch prevents mmap of namespaces with MAP_SYNC flag if the >>> nvdimm requires an explicit flush[1]. >>> >>> References: >>> [1] https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/ndctl.py.data/map_sync.c >>> >>> Signed-off-by: Shivaprasad G Bhat <sbhat@xxxxxxxxxxxxx> >>> --- >>> v2 - https://www.spinics.net/lists/kvm-ppc/msg18799.html >>> Changes from v2: >>> - Fixed the commit message. >>> - Add dev_dbg before the H_SCM_FLUSH hcall >>> >>> v1 - https://www.spinics.net/lists/kvm-ppc/msg18272.html >>> Changes from v1: >>> - Hcall semantics finalized, all changes are to accomodate them. >>> >>> Documentation/powerpc/papr_hcalls.rst | 14 ++++++++++ >>> arch/powerpc/include/asm/hvcall.h | 3 +- >>> arch/powerpc/platforms/pseries/papr_scm.c | 40 +++++++++++++++++++++++++++++ >>> 3 files changed, 56 insertions(+), 1 deletion(-) >>> >>> diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst >>> index 48fcf1255a33..648f278eea8f 100644 >>> --- a/Documentation/powerpc/papr_hcalls.rst >>> +++ b/Documentation/powerpc/papr_hcalls.rst >>> @@ -275,6 +275,20 @@ Health Bitmap Flags: >>> Given a DRC Index collect the performance statistics for NVDIMM and copy them >>> to the resultBuffer. >>> >>> +**H_SCM_FLUSH** >>> + >>> +| Input: *drcIndex, continue-token* >>> +| Out: *continue-token* >>> +| Return Value: *H_SUCCESS, H_Parameter, H_P2, H_BUSY* >>> + >>> +Given a DRC Index Flush the data to backend NVDIMM device. >>> + >>> +The hcall returns H_BUSY when the flush takes longer time and the hcall needs >>> +to be issued multiple times in order to be completely serviced. The >>> +*continue-token* from the output to be passed in the argument list of >>> +subsequent hcalls to the hypervisor until the hcall is completely serviced >>> +at which point H_SUCCESS or other error is returned by the hypervisor. >>> + >> Does the hcall semantic also include measures to trigger a barrier/fence >> (like pm_wmb()) so that all the stores before the hcall are gauranteed >> to have hit the pmem controller ? > > It is upto the hypervisor to implement the right sequence to ensure the > guarantee. The hcall doesn't expect the store to reach the platform > buffers. Since the asyc_flush function is also called for performing deep_flush from libnvdimm and as the hcall doesnt gaurantee stores to reach the platform buffers, hence papr_scm_pmem_flush() should issue pm_wmb() before the hcall. This would ensure papr_scm_pmem_flush() semantics are consistent that to generic_nvdimm_flush(). Also, adding the statement "The hcall doesn't expect the store to reach the platform buffers" to the hcall documentation would be good to have. > > >> >> If not then the papr_scm_pmem_flush() introduced below should issue a >> fence like pm_wmb() before issuing the flush hcall. 
>
>>
>> If not then the papr_scm_pmem_flush() introduced below should issue a
>> fence like pm_wmb() before issuing the flush hcall.
>>
>> If yes then this behaviour should also be documented with the hcall
>> semantics above.
>>
>
> IIUC fdatasync on the backend file is enough to ensure the
> guarantee. Such a request will have REQ_PREFLUSH and REQ_FUA, which will
> do the necessary barrier on the backing device holding the backend file.
>

Right, but that's assuming the nvdimm is backed by a regular file in the
hypervisor, which may not always be the case.

> -aneesh

--
Cheers
~ Vaibhav
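P.S. For anyone who wants to verify the MAP_SYNC behaviour this patch
enforces, below is a minimal userspace check in the spirit of the
map_sync.c test referenced in the patch description. It is illustrative
only; the fallback flag definitions carry the standard uapi values in
case older libc headers lack them. Run it against a file on a
DAX-mounted filesystem backed by a papr_scm namespace; on a region
flagged ND_REGION_ASYNC the MAP_SYNC mapping should fail with
EOPNOTSUPP.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

int main(int argc, char **argv)
{
	if (argc < 2) {
		fprintf(stderr, "usage: %s <file-on-dax-fs>\n", argv[0]);
		return 1;
	}

	int fd = open(argv[1], O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Expected to fail with EOPNOTSUPP on an ND_REGION_ASYNC region */
	void *addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			  MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
	if (addr == MAP_FAILED)
		perror("mmap(MAP_SYNC)");
	else
		munmap(addr, 4096);

	close(fd);
	return 0;
}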