On Tue, Jun 7, 2022 at 2:07 PM Kohei Tarumizu
<tarumizu.kohei@xxxxxxxxxxx> wrote:

> This patch series add sysfs interface to control CPU's hardware
> prefetch behavior for performance tuning from userspace for the
> processor A64FX and x86 (on supported CPU).

OK

> A64FX and some Intel processors have implementation-dependent register
> for controlling CPU's hardware prefetch behavior. A64FX has
> IMP_PF_STREAM_DETECT_CTRL_EL0[1], and Intel processors have MSR 0x1a4
> (MSR_MISC_FEATURE_CONTROL)[2].

Hardware prefetch (I guess of memory contents) is a memory hierarchy
feature.

Linux has a memory hierarchy manager, conveniently named "mm",
developed by some of the smartest people I know. The main problem
addressed by that is paging, but prefetching into the CPU from the
next lowest level in the memory hierarchy is just another memory
hierarchy hardware feature, like hard disks, primary RAM etc.

> These registers cannot be accessed from userspace.

Good. The kernel manages hardware. If the memory hierarchy people have
userspace now doing stuff behind their back, through some special
interface, that makes their world more complicated.

This looks like it needs information from the generic memory manager,
from the scheduler, and possibly all the way down from the block layer
to do the right thing, so it has no business in userspace.

Have you seen mm/damon for example? Access to statistics for memory
access patterns seems really useful for tuning the behaviour of this
hardware. Just my €0.01.

If it does interact with userspace I suppose it should be using
control groups, like everything else of this type, see e.g.
mm/memcontrol.c, not custom sysfs files.

Just an example from one of the patches:

+ - "* Adjacent Cache Line Prefetcher Disable (R/W)"
+   corresponds to the "adjacent_cache_line_prefetcher_enable"

I might only be at the "a little knowledge is dangerous" level on
memory manager topics, but I know for sure that they at times adjust
the members of structs to fit nicely on cache lines. And now this? It
looks really useful for kernel machinery that knows very well what
needs to go into the cache line next and when.

Talk to the people on linux-mm and memory maintainer Andrew Morton on
how to do this right, it's a really interesting feature! Also, given
that people say that the memory hierarchy is an important part of the
performance of the Apple M1 (M2) silicon, I expect that machine to
have this too?

Yours,
Linus Walleij