Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

Nikita Yushchenko <nikita.yushchenko@xxxxxxxxxxxxx> · Thu, 23 Dec 2021 12:55:38 +0300

>> I currently don't have numbers for this patch taken alone. This patch
>> originates from work done some years ago to reduce cost of memory
>> accounting, and x86-only version of this patch was in virtuozzo/openvz
>> kernel since then. Other patches from that work have been upstreamed,
>> but this one was missed.
>>
>> Still it's obvious that release_pages() shall be faster than a loop
>> calling put_page() - isn't that exactly the reason why release_pages()
>> exists and is different from a loop calling put_page()?
>
> Yep, but this patch does a bunch of stuff to some really hot paths.  It
> would be greatly appreciated if you could put in the effort to actually
> put some numbers behind this.  Plenty of weird stuff happens on
> computers that we suck at predicting.
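
To make the batching argument above concrete, here is a toy user-space
model in Python. It is only an analogy, not kernel code: Pool, put_one()
and release_batch() are hypothetical names, and the single counted lock
stands in for whatever shared state a per-object release has to touch.
It shows only that the per-object loop pays one lock round trip per
object while the batched call pays one per batch. It says nothing about
real timings, which is exactly why measurements are needed.

```python
import threading


class CountingLock:
    """A lock wrapper that counts how many times it is acquired."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False


class Pool:
    """Hypothetical shared state guarded by a single lock."""

    def __init__(self):
        self.lock = CountingLock()
        self.freed = 0

    def put_one(self, item):
        # Per-object path: one lock round trip for every object.
        with self.lock:
            self.freed += 1

    def release_batch(self, items):
        # Batched path: one lock round trip for the whole batch.
        with self.lock:
            self.freed += len(items)


def compare(n):
    """Return (lock acquisitions in the loop, lock acquisitions when batched)."""
    items = list(range(n))

    looped = Pool()
    for item in items:
        looped.put_one(item)

    batched = Pool()
    batched.release_batch(items)

    # Both paths must release the same number of objects.
    assert looped.freed == batched.freed == n
    return looped.lock.acquisitions, batched.lock.acquisitions
```

For example, compare(512) reports 512 acquisitions for the loop and 1
for the batched call.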

I found the original report about the high cost of memory accounting, and
tried to repeat the test described there, with and without the patch.

The test is: run a script in 30 openvz containers in parallel, and
measure the average time per execution. The script is attached.

I'm getting a measurable improvement in average time per execution:
15360 ms without the patch, 15170 ms with the patch (190 ms, or about
1.2%). This difference is reliably reproducible.
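
For anyone who wants to repeat a measurement of this shape, a minimal
sketch of the harness in Python follows. It is not the attached script
and it does not set up containers: it assumes the workload can be
started as a plain command, and the names timed_run(), average_ms() and
relative_improvement() are made up for illustration. It starts several
copies of a command at once, records the wall-clock time of each, and
reports the mean per execution.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(cmd):
    """Run cmd once and return its wall-clock duration in milliseconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return (time.perf_counter() - start) * 1000.0


def average_ms(cmd, parallel=30, rounds=1):
    """Run `parallel` copies of cmd concurrently, `rounds` times over.

    Returns the mean duration of a single execution in milliseconds.
    """
    durations = []
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        for _ in range(rounds):
            durations.extend(pool.map(timed_run, [cmd] * parallel))
    return sum(durations) / len(durations)


def relative_improvement(before_ms, after_ms):
    """Fraction of the baseline time saved, e.g. 0.01 for 1%."""
    return (before_ms - after_ms) / before_ms


if __name__ == "__main__":
    # Placeholder workload (an assumption): replace with the real script.
    cmd = [sys.executable, "-c", "sum(range(10**6))"]
    print(f"{average_ms(cmd, parallel=4):.1f} ms per execution")
    # Figures quoted in this mail: 15360 ms without, 15170 ms with the patch.
    print(f"{relative_improvement(15360, 15170):.2%} improvement")
```

With a difference this small, it is worth running several rounds on each
kernel and comparing the spread of the results, not only the two means.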

Nikita

Attachment: calcprimes.sh (Bourne shell script)