On Fri, Aug 04, 2023 at 10:17:35AM -0400, Guo Ren wrote:

> > See, this is where the ARM64 WFE would come in handy; I don't suppose
> > RISC-V has anything like that?
> Em... arm64 smp_cond_load only could save power consumption or release
> the pipeline resources of an SMT processor. When (Node1 cpu64) is in
> the WFE state, it still needs (Node0 cpu1) to write the value to give
> a cross-NUMA signal. So I didn't see what WFE related to reducing
> cross-Numa transactions, or I missed something. Sorry

The benefit is that WFE significantly reduces the memory traffic. Since
it 'suspends' the core and waits for a write-notification instead of
busy polling the memory location, you get a ton fewer loads.
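
To put a rough number on that, here is a toy, single-threaded model of
the two waiting strategies. It is only a sketch: struct sim, sim_load(),
wait_busy_poll() and wait_for_event() are made-up names for illustration,
not kernel code, and the model assumes one load per simulated tick while
spinning and a wakeup that arrives exactly when the remote store lands.

```c
#include <stdint.h>

/*
 * Toy model (hypothetical, not kernel code): a remote CPU stores to
 * 'flag' at tick 'store_tick'; we count the loads the waiter issues.
 */
struct sim {
	uint32_t flag;            /* the polled memory location */
	unsigned long now;        /* simulated time, in ticks */
	unsigned long store_tick; /* tick at which the remote store lands */
	unsigned long loads;      /* loads issued by the waiting CPU */
};

/* One load of the flag; the remote store becomes visible at store_tick. */
static uint32_t sim_load(struct sim *s)
{
	s->loads++;
	if (s->now >= s->store_tick)
		s->flag = 1;
	return s->flag;
}

/* Busy polling: the waiter re-reads the location on every tick. */
static unsigned long wait_busy_poll(struct sim *s)
{
	while (sim_load(s) == 0)
		s->now++;
	return s->loads;
}

/*
 * WFE-style wait: load once, and if the value is not there yet, idle
 * until the write event, then load again to observe the new value.
 */
static unsigned long wait_for_event(struct sim *s)
{
	while (sim_load(s) == 0) {
		/* stand-in for WFE: no loads are issued while suspended */
		if (s->now < s->store_tick)
			s->now = s->store_tick;
	}
	return s->loads;
}
```

In this model the spinning waiter issues one load per tick until the
store arrives, while the event-based waiter issues two loads no matter
how long it waits. The remote write is still needed in both cases; what
changes is how often the waiter touches the location in the meantime.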