On Fri, 2017-11-03 at 16:30 -0400, nilal@xxxxxxxxxx wrote:
> The following patch-set proposes an efficient mechanism for handing
> freed memory between the guest and the host. It enables the guests
> with DAX (no page cache) to rapidly free and reclaim memory to and
> from the host respectively.
>
> Performance:
> Test criteria: Kernel Build
> Command: make clean;make defconfig;time make
> With Hinting:
> real: 21m24.680s
> user: 16m3.362s
> sys : 2m19.027s
> Without Hinting:
> real: 21m18.062s
> user: 16m13.969s
> sys : 1m17.884s
>
> Test criteria: Stress Test
> Command: time stress --io 2 --cpu 2 --vm 2 --vm-bytes 1024M --timeout 100s -v
> With Hinting:
> real: 1m40.726s
> user: 1m23.449s
> sys : 0m5.576s
> Without Hinting:
> real: 1m40.378s
> user: 1m21.292s
> sys : 0m4.972s

These numbers look really good, but these workloads are mostly in
user space. Could you also try with more kernel heavy workloads,
like netperf (sender and receiver on the same CPU, vs sender and
receiver on different CPUs) and hackbench?
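
When comparing runs like the ones quoted above, it can help to look at the
relative change per row rather than the raw timings. A minimal sketch,
assuming the figures are plain `time` output in the `NmS.SSSs` form; the
helper names `to_seconds` and `relative_change` are hypothetical and only
used for illustration:

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '21m24.680s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def relative_change(with_hinting: str, without_hinting: str) -> float:
    """Percent change of the hinting run relative to the run without it."""
    base = to_seconds(without_hinting)
    return (to_seconds(with_hinting) - base) / base * 100.0


# Timings copied from the quoted results: (with hinting, without hinting).
RESULTS = {
    "kernel build": {
        "real": ("21m24.680s", "21m18.062s"),
        "user": ("16m3.362s", "16m13.969s"),
        "sys": ("2m19.027s", "1m17.884s"),
    },
    "stress": {
        "real": ("1m40.726s", "1m40.378s"),
        "user": ("1m23.449s", "1m21.292s"),
        "sys": ("0m5.576s", "0m4.972s"),
    },
}

if __name__ == "__main__":
    for test, rows in RESULTS.items():
        for kind, (hinted, plain) in rows.items():
            change = relative_change(hinted, plain)
            print(f"{test:12s} {kind:4s} {change:+7.2f}%")
```

The same helpers could be pointed at netperf or hackbench timings once
those are available, where the sys row is expected to carry more weight.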