Mel Gorman <mgorman@xxxxxxx> writes:

> On Tue, Feb 18, 2020 at 04:26:29PM +0800, Huang, Ying wrote:
>> From: Huang Ying <ying.huang@xxxxxxxxx>
>>
>> In a memory tiering system, if the memory size of the workloads is
>> smaller than that of the faster memory (e.g. DRAM) nodes, all pages of
>> the workloads should be put in the faster memory nodes. But this
>> makes it unnecessary to use slower memory (e.g. PMEM) at all.
>>
>> So in common cases, the memory size of the workload should be larger
>> than that of the faster memory nodes. And to optimize the
>> performance, the hot pages should be promoted to the faster memory
>> nodes while the cold pages should be demoted to the slower memory
>> nodes. To achieve that, we have two choices,
>>
>> a. Promote the hot pages from the slower memory node to the faster
>>    memory node. This will create some memory pressure in the faster
>>    memory node, thus triggering memory reclaim, where the cold
>>    pages will be demoted to the slower memory node.
>>
>> b. Demote the cold pages from the faster memory node to the slower
>>    memory node. This will create some free memory space in the faster
>>    memory node, and the hot pages in the slower memory node could be
>>    promoted to the faster memory node.
>>
>> The choice "a" will create memory pressure in the faster memory
>> node. If the memory pressure of the workload is high too, the memory
>> pressure may become so high that the memory allocation latency of the
>> workload is affected, e.g. direct reclaim may be triggered.
>>
>> The choice "b" works much better in this respect. If the memory
>> pressure of the workload is high, it will consume the free memory and
>> hot page promotion will stop earlier if its allocation watermark
>> is higher than that of the normal memory allocation.
>>
>> In this patch, choice "b" is implemented. If memory tiering NUMA
>> balancing mode is enabled, the node isn't the slowest node, and the
>> free memory size of the node is below the high watermark, the kswapd
>> of the node will be woken up to free some memory until the free memory
>> size is above the high watermark + autonuma promotion rate limit. If
>> the free memory size is below the high watermark, autonuma promotion
>> will stop working. This avoids creating too much memory pressure on
>> the system.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
>
> Unfortunately I stopped reading at this point. It depends on another series
> entirely and they really need to be presented together instead of relying
> on searching mail archives to find other patches to try to assemble the full
> picture :(. Ideally each stage would have supporting data showing roughly
> how it behaves at each major stage. I know this will be a pain but the
> original NUMA balancing had the same problem and ultimately started with
> one large series that got the basics right followed by other series that
> improved it in stages. That process is *still* ongoing today.

Sorry for the inconvenience. We will post a new patchset that includes both
series, and add supporting data at each major stage where possible.

Best Regards,
Huang, Ying
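
For reference, the three conditions in the quoted patch description (when to wake kswapd, how far it reclaims, and when promotion stops) can be sketched roughly as below. This is a simplified userspace illustration under stated assumptions, not code from the patch: `struct node_state`, its fields, and all three function names are hypothetical, and the exact boundary comparisons (`<` versus `<=`) are a guess from the wording of the description.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of one memory node; NOT a kernel structure. */
struct node_state {
    unsigned long free_pages;       /* current free memory, in pages */
    unsigned long high_wmark;       /* high watermark, in pages */
    unsigned long promo_rate_limit; /* autonuma promotion rate limit, in pages */
    bool is_slowest_tier;           /* true if no slower node exists to demote to */
};

/*
 * kswapd is woken only when tiering mode is enabled, the node is not the
 * slowest one, and free memory has dropped below the high watermark.
 */
static bool should_wake_kswapd(const struct node_state *n, bool tiering_enabled)
{
    return tiering_enabled && !n->is_slowest_tier &&
           n->free_pages < n->high_wmark;
}

/*
 * Once woken, kswapd keeps freeing memory until free memory is above
 * high watermark + promotion rate limit.
 */
static bool kswapd_should_continue(const struct node_state *n)
{
    return n->free_pages <= n->high_wmark + n->promo_rate_limit;
}

/* Promotion stops while free memory is below the high watermark. */
static bool promotion_allowed(const struct node_state *n)
{
    return n->free_pages >= n->high_wmark;
}
```

The gap between the two thresholds is the point of the design as described: kswapd reclaims past the high watermark by the promotion rate limit, so promotion has headroom to consume before free memory falls back below the high watermark and promotion stops again.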