On 8/31/2023 12:19 AM, Ankur Arora wrote:
clear_pages_rep(), clear_pages_erms() clear using string instructions. While clearing extents of more than a single page, we can use these more effectively by explicitly advertising the region-size to the processor.

This can be used as a hint by the processor-uarch to optimize the clearing (ex. to avoid polluting one or more levels of the data-cache.)

As a secondary benefit, string instructions are typically microcoded, and so it's a good idea to amortize the cost of the decode across larger regions.

Accordingly, clear_huge_page() now does huge-page clearing in three parts: the neighbourhood of the faulting address, the left, and the right region of the neighbourhood. The local neighbourhood is cleared last to keep its cachelines hot.
[...]
Signed-off-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx>
---
 arch/x86/mm/hugetlbpage.c | 54 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)
Hello Ankur,

Just thinking out loud here (w.r.t. THP). The V3 patchset with the uarch changes also had changes in the THP path, where one could explicitly give hints, or non-caching hints, and these were passed down to call incoherent clearing. IMO, those changes logically belong with the uarch optimizations, right?
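
For reference, the three-part ordering described in the quoted commit message could be sketched roughly as below. This is a userspace illustration only and not the code from the patch: clear_extent(), clear_huge_sketch(), SKETCH_PAGE_SIZE and the width parameter are hypothetical names, and memset() merely stands in for clear_pages_rep()/clear_pages_erms().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed base page size */

/*
 * Hypothetical helper: clear npages contiguous pages starting at page
 * index first with a single call, so the full extent size is known
 * up front (memset() is a stand-in for the string-instruction path).
 */
static void clear_extent(unsigned char *base, size_t first, size_t npages)
{
	if (npages)
		memset(base + first * SKETCH_PAGE_SIZE, 0,
		       npages * SKETCH_PAGE_SIZE);
}

/*
 * Clear nr_pages pages in three parts: the region left of the
 * neighbourhood, the region right of it, and finally the neighbourhood
 * of the faulting page, so that its cachelines are touched last.
 * width (pages on each side of fault_idx) is an assumed parameter.
 */
static void clear_huge_sketch(unsigned char *base, size_t nr_pages,
			      size_t fault_idx, size_t width)
{
	size_t lo, hi;

	assert(fault_idx < nr_pages);

	/* Neighbourhood is [lo, hi), clamped to the huge page. */
	lo = fault_idx > width ? fault_idx - width : 0;
	hi = fault_idx + width + 1 < nr_pages ? fault_idx + width + 1
					      : nr_pages;

	clear_extent(base, 0, lo);		/* left region */
	clear_extent(base, hi, nr_pages - hi);	/* right region */
	clear_extent(base, lo, hi - lo);	/* neighbourhood, last */
}
```

Each of the three calls covers one contiguous extent, which is the point of advertising the region size instead of looping page by page.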