On Wed, Nov 16, 2022 at 3:53 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
>
> On Wed, Nov 16, 2022 at 10:47:04PM +0000, Edgecombe, Rick P wrote:
> > On Wed, 2022-11-16 at 14:33 -0800, Luis Chamberlain wrote:
> > > More in line with what I was hoping for. Can something just do
> > > the parallelization for you in one shot? Can bench alone do it
> > > for you? Is there no interest in having something which
> > > generically showcases multithreading / hammering a system with
> > > tons of eBPF JITs? It may prove useful.
> > >
> > > And also, it begs the question: what if you had another generic
> > > iTLB benchmark or general memory pressure workload running *as*
> > > you run the above? I ask, as it was my understanding that one of
> > > the issues was the long-term slowdown caused by direct map
> > > fragmentation without bpf_prog_pack, and so such an application
> > > should crawl to its knees over time, and there should be numbers
> > > you could show to prove that too, before and after.
> >
> > We did have some benchmarks that showed, if your direct map was
> > totally fragmented (started from boot at 4k page size), what the
> > regression was:
> >
> > https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@xxxxxxxxxxxxxxx/
>
> Oh yes, that is a good example of effort, but I'm suggesting taking,
> for instance, will-it-scale and running it in tandem with
> bpf_prog_pack, measuring *both* the iTLB differences, before / after,
> *and* doing this again after a period of expected deterioration of
> the direct map (say, after the non-bpf_prog_pack case shows high
> direct map fragmentation).
>
> This is the sort of thing which could easily go into a commit log.

To be honest, I don't see how experimental results with artificial
benchmarks would help this set. I don't think a real workload would
see a 10% speedup from this set (we can see large % improvements in
TLB miss rate, though). However, a 1% or even 0.5% improvement matters
a lot for large-scale workloads.

Thanks,
Song
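
For concreteness, the kind of before/after measurement Luis describes
could be scripted roughly as below. This is a minimal sketch, not code
from the thread: it assumes an x86 host with perf installed, the
workload path is a hypothetical placeholder, and it relies only on the
standard perf event names (iTLB-load-misses, iTLB-loads) and the
DirectMap* counters that /proc/meminfo exposes on x86. It snapshots
direct map fragmentation, runs a workload under `perf stat`, and
reports the iTLB miss rate.

#!/usr/bin/env python3
# Minimal sketch: snapshot direct map fragmentation from /proc/meminfo
# and measure the iTLB miss rate of a workload under `perf stat`.
# Assumes an x86 host with perf installed; the workload is a placeholder.
import re
import subprocess

# Hypothetical workload path; substitute a will-it-scale binary or any
# other benchmark to run while bpf_prog_pack is (or isn't) in use.
WORKLOAD = ["./will-it-scale/page_fault1_processes"]

def direct_map_snapshot():
    """Return the DirectMap4k/2M/1G counters (in kB) from /proc/meminfo."""
    counters = {}
    with open("/proc/meminfo") as f:
        for line in f:
            m = re.match(r"(DirectMap\w+):\s+(\d+)\s+kB", line)
            if m:
                counters[m.group(1)] = int(m.group(2))
    return counters

def itlb_counts(cmd):
    """Run cmd under `perf stat` and return {event: count} for iTLB events."""
    res = subprocess.run(
        ["perf", "stat", "-x", ",", "-e", "iTLB-load-misses,iTLB-loads"] + cmd,
        capture_output=True, text=True,
    )
    counts = {}
    # perf stat writes its CSV counter output to stderr.
    for line in res.stderr.splitlines():
        fields = line.split(",")
        for ev in ("iTLB-load-misses", "iTLB-loads"):
            if ev in fields:
                try:
                    counts[ev] = int(fields[0])
                except ValueError:  # "<not counted>" / "<not supported>"
                    pass
    return counts

if __name__ == "__main__":
    print("direct map before:", direct_map_snapshot())
    counts = itlb_counts(WORKLOAD)
    misses = counts.get("iTLB-load-misses", 0)
    loads = counts.get("iTLB-loads", 0)
    rate = 100.0 * misses / loads if loads else 0.0
    print(f"iTLB miss rate: {rate:.4f}% ({misses}/{loads})")
    print("direct map after:", direct_map_snapshot())

Run once on a freshly booted system and again after prolonged JIT churn
without bpf_prog_pack: a shrinking DirectMap2M/1G share together with a
rising iTLB miss rate would be the long-term deterioration under
discussion.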