On 5/13/20 2:31 PM, Pekka Enberg wrote:
> Hi Jens,
>
> On 5/13/20 1:20 PM, Pekka Enberg wrote:
>>> So I assume if someone does "perf record", they will see significant
>>> reduction in page allocator activity with Jens' patch. One possible way
>>> around that is forcing the page allocation order to be much higher. IOW,
>>> something like the following completely untested patch:
>
> On 5/13/20 11:09 PM, Jens Axboe wrote:
>> Now tested, I gave it a shot. This seems to bring performance to
>> basically what the io_uring patch does, so that's great! Again, just in
>> the microbenchmark test case, so freshly booted and just running the
>> case.
>
> Great, thanks for testing!
>
> On 5/13/20 11:09 PM, Jens Axboe wrote:
>> Will this patch introduce latencies or non-deterministic behavior for a
>> fragmented system?
>
> You have to talk to someone who is more up-to-date with how the page
> allocator operates today. But yeah, I assume people still want to avoid
> higher-order allocations as much as possible, because they make
> allocation harder when memory is fragmented.

That was my thinking... I don't want a random io_kiocb allocation to
take a long time because of high order allocations.

> That said, perhaps it's not going to the page allocator as much as I
> thought, but the problem is that the per-CPU cache size is just too small
> for these allocations, forcing do_slab_free() to take the slow path
> often. Would be interesting to know if CONFIG_SLAB does better here
> because the per-CPU cache size is much larger IIRC.

Just tried with SLAB, and it's roughly 4-5% down from the baseline
(non-modified) SLUB. So not faster, at least for this case.

-- 
Jens Axboe