On Wed, Nov 16, 2022 at 3:53 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
>
> On Wed, Nov 16, 2022 at 10:47:04PM +0000, Edgecombe, Rick P wrote:
> > On Wed, 2022-11-16 at 14:33 -0800, Luis Chamberlain wrote:
> > > More in line with what I was hoping for. Can something just do
> > > the parallelization for you in one shot? Can bench alone do it
> > > for you? Is there no interest in having something which
> > > generically showcases multithreading / hammering a system with
> > > tons of eBPF JITs? It may prove useful.
> > >
> > > And also, it begs the question: what if you had another generic
> > > iTLB benchmark or general memory pressure workload running *as*
> > > you run the above? I ask, as it was my understanding that one of
> > > the issues was the long-term slowdown caused by direct map
> > > fragmentation without bpf_prog_pack, and so such an application
> > > should crawl to its knees over time, and there should be numbers
> > > you could show to prove that too, before and after.
> >
> > We did have some benchmarks that showed, if your direct map was
> > totally fragmented (started from boot at 4k page size), what the
> > regression was:
> >
> > https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@xxxxxxxxxxxxxxx/
>
> Oh yes, that is a good example of effort, but I'm suggesting taking,
> for instance, will-it-scale and running it in tandem with
> bpf_prog_pack, measuring *both* the iTLB differences, before / after,
> *and* doing this again after a period of expected deterioration of
> the direct map (say, after the non-bpf_prog_pack case shows high
> direct map fragmentation).
>
> This is the sort of thing which could easily go into a commit log.

To be honest, I don't see how experimental results with artificial
benchmarks would help this set. I don't think a real workload would
see a 10% speedup from this set (we can see large % improvements in
TLB miss rate, though). However, a 1% or even 0.5% improvement matters
a lot for large-scale workloads.

Thanks,
Song
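
For concreteness, the kind of before/after measurement Luis describes
could be scripted roughly as below. This is a minimal sketch, not code
from the thread: it assumes an x86 host with perf installed, the
workload path is a hypothetical placeholder, and it relies only on the
standard perf event names (iTLB-load-misses, iTLB-loads) and the
DirectMap* counters that /proc/meminfo exposes on x86. It snapshots
direct map fragmentation, runs a workload under `perf stat`, and
reports the iTLB miss rate.

#!/usr/bin/env python3
# Minimal sketch: snapshot direct map fragmentation from /proc/meminfo
# and measure the iTLB miss rate of a workload under `perf stat`.
# Assumes an x86 host with perf installed; the workload is a placeholder.
import re
import subprocess

# Hypothetical workload path; substitute a will-it-scale binary or any
# other benchmark to run while bpf_prog_pack is (or isn't) in use.
WORKLOAD = ["./will-it-scale/page_fault1_processes"]

def direct_map_snapshot():
    """Return the DirectMap4k/2M/1G counters (in kB) from /proc/meminfo."""
    counters = {}
    with open("/proc/meminfo") as f:
        for line in f:
            m = re.match(r"(DirectMap\w+):\s+(\d+)\s+kB", line)
            if m:
                counters[m.group(1)] = int(m.group(2))
    return counters

def itlb_counts(cmd):
    """Run cmd under `perf stat` and return {event: count} for iTLB events."""
    res = subprocess.run(
        ["perf", "stat", "-x", ",", "-e", "iTLB-load-misses,iTLB-loads"] + cmd,
        capture_output=True, text=True,
    )
    counts = {}
    # perf stat writes its CSV counter output to stderr.
    for line in res.stderr.splitlines():
        fields = line.split(",")
        for ev in ("iTLB-load-misses", "iTLB-loads"):
            if ev in fields:
                try:
                    counts[ev] = int(fields[0])
                except ValueError:  # "<not counted>" / "<not supported>"
                    pass
    return counts

if __name__ == "__main__":
    print("direct map before:", direct_map_snapshot())
    counts = itlb_counts(WORKLOAD)
    misses = counts.get("iTLB-load-misses", 0)
    loads = counts.get("iTLB-loads", 0)
    rate = 100.0 * misses / loads if loads else 0.0
    print(f"iTLB miss rate: {rate:.4f}% ({misses}/{loads})")
    print("direct map after:", direct_map_snapshot())

Run once on a freshly booted system and again after prolonged JIT churn
without bpf_prog_pack: a shrinking DirectMap2M/1G share together with a
rising iTLB miss rate would be the long-term deterioration under
discussion.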