On 6/1/24 1:39 AM, Roman Gushchin wrote:
> On Fri, May 31, 2024 at 11:33:31AM +0200, Vlastimil Babka wrote:
>> Incomplete, help needed from ftrace/kprobe and bpf folks.
>>
>> As previously mentioned by myself [1] and others [2] the functions
>> designed for error injection can bring visible overhead in fastpaths
>> such as slab or page allocation, because even if nothing hooks into them
>> at a given moment, they are noninline function calls regardless of
>> CONFIG_ options since commits 4f6923fbb352 ("mm: make should_failslab
>> always available for fault injection") and af3b854492f3
>> ("mm/page_alloc.c: allow error injection").
>>
>> Live patching their callsites has been also suggested in both [1] and
>> [2] threads, and this is an attempt to do that with static keys that
>> guard the call sites. When disabled, the error injection functions still
>> exist and are noinline, but are not being called. Any of the existing
>> mechanisms that can inject errors should make sure to enable the
>> respective static key. I have added that support to some of them but
>> need help with the others.
>
> I think it's a clever idea and makes total sense!

Thanks!

>>
>> Patches 3 and 4 implement the static keys for the two mm fault injection
>> sites in slab and page allocators. For a quick demonstration I've run a
>> VM and the simple test from [1] that stresses the slab allocator and got
>> this time before the series:
>>
>> real 0m8.349s
>> user 0m0.694s
>> sys 0m7.648s
>>
>> with perf showing
>>
>> 0.61% nonexistent [kernel.kallsyms] [k] should_failslab.constprop.0
>> 0.00% nonexistent [kernel.kallsyms] [k] should_fail_alloc_page
>>
>> And after the series
>>
>> real 0m7.924s
>> user 0m0.727s
>> sys 0m7.191s
>
> Is "user" increase a measurement error or it's real?

Hm interesting, I actually did the measurement 3 times even though I
pasted just one, and it's consistent. But it could be just an artifact of
where things landed in the cache, and might change a bit with every
kernel build/boot. Will see. There's no reason why this should affect
user time.

> Otherwise, nice savings!
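
For anyone following along who hasn't seen the pattern, the call-site
guarding described in the cover letter can be sketched in userspace
roughly as below. This is not the code from the series. The names
failslab_key_enabled, should_failslab_hook, alloc_should_fail,
injection_enable and injection_disable are hypothetical, and a plain
bool stands in for the static key. In the kernel the key would be
declared with DEFINE_STATIC_KEY_FALSE() and tested with
static_branch_unlikely(), which patches the branch in the code instead
of loading a flag from memory.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a static key; false means "nothing hooked". */
bool failslab_key_enabled;

/* Bookkeeping for the sketch only: how often the hook was really called. */
unsigned long hook_calls;
bool inject_failure;

/*
 * The injection hook. It stays out of line so that external tools have a
 * real function to attach to, as the cover letter describes.
 */
__attribute__((noinline)) int should_failslab_hook(size_t size)
{
	(void)size;
	hook_calls++;
	return inject_failure ? -1 : 0;
}

/*
 * The guarded call site. While the key is disabled the fast path never
 * performs the out-of-line call.
 */
bool alloc_should_fail(size_t size)
{
	if (__builtin_expect(failslab_key_enabled, 0))
		return should_failslab_hook(size) != 0;
	return false;
}

/* Whoever sets up injection is responsible for enabling the key. */
void injection_enable(bool fail)
{
	inject_failure = fail;
	failslab_key_enabled = true;
}

void injection_disable(void)
{
	failslab_key_enabled = false;
	inject_failure = false;
}
```

The point of the sketch is only the ordering: the hook is reached
exclusively through the key check, so a mechanism that wants to inject
errors has to flip the key first, otherwise the hook is never called.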