On Mon, 2022-11-07 at 15:39 -0800, Luis Chamberlain wrote:
> On Mon, Nov 07, 2022 at 03:13:59PM -0800, Song Liu wrote:
> > The benchmark used here is identical on our web service, which runs
> > on many many servers, so it represents the workload that we care
> > about a lot. Unfortunately, it is not possible to run it out of our
> > data centers.
>
> I am not asking for that, I am asking for you to pick any similar
> benchmark which can run in parallel which may yield similar results.
>
> > We can build some artificial workloads and probably get much higher
> > performance improvements. But these workloads may not represent real
> > world use cases.
>
> You can very likely use some existing benchmark.
>
> The direct map fragmentation stuff doesn't require much effort, as
> was demonstrated by Aaron, you can easily do that or more by
> running all selftests or just the test_bpf. This I buy.
>
> I'm not buying the iTLB gains as I can't even reproduce them myself
> for eBPF JIT, but I tested against iTLB when using eBPF JIT, perhaps
> you mean iTLB gains for other memory intensive applications running
> in tandem?

Song, didn't you find that there weren't any iTLB gains (or that they
were in the noise)? What is this about a visible performance drop from
iTLB misses?

IIRC there was a test done where progpack mapped things at 4k, but in
2MB chunks, so it would re-use pages like the 2MB mapped version. And
it didn't see much improvement over the 2MB mapped version. Did I
remember that wrong?

> And none of your patches mentions the gains of this effort helping
> with the long term advantage of centralizing the semantics for
> permissions on memory.

Another good point. Although this brings up that this interface,
"execmem", handles just one type of permission.