On Tue, 25 Jun 2024, Ryan Roberts wrote:
But I also want to raise a more general point: we are not done with the optimizations yet. contpte can also improve iTLB performance, but this requires a change to the page cache to store text in (at least) 64K folios. Typically the iTLB is under a lot of pressure, and this can help reduce it. This change is not in mainline yet (and I still need to figure out how to make the patch acceptable), but it is worth another ~1.5% for the 4KPS case. I suspect this will also move the needle on the other benchmarks you ran. See [3] - I'd appreciate any thoughts you have on how to get something like this accepted.

[3] https://lore.kernel.org/lkml/20240111154106.3692206-1-ryan.roberts@xxxxxxx/
The discussion here seems to indicate that readahead is already OK for order-2 (16K mTHP size?). So is this only for 64K mTHP with 4K base pages?
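
For reference, the order-to-size arithmetic behind this question can be sketched as below. This is only an illustration, assuming 4K base pages and the arm64 contpte block of 16 contiguous PTEs that applies to the 4K granule; the helper names are hypothetical and are not kernel APIs.

```python
# Illustrative only: folio order <-> size arithmetic for 4K base pages.
# PAGE_SIZE and CONT_PTES are assumptions matching arm64 with a 4K granule.
PAGE_SIZE = 4096   # 4K base page
CONT_PTES = 16     # contiguous PTEs covered by one contpte block (4K granule)


def folio_size(order, page_size=PAGE_SIZE):
    """Size in bytes of a folio of the given order (2**order base pages)."""
    return page_size << order


def order_for_size(size, page_size=PAGE_SIZE):
    """Folio order for a power-of-two size given in bytes."""
    pages = size // page_size
    return pages.bit_length() - 1


def contpte_span(page_size=PAGE_SIZE, cont_ptes=CONT_PTES):
    """Bytes mapped by one contpte block."""
    return page_size * cont_ptes


if __name__ == "__main__":
    print("order-2 folio:", folio_size(2) // 1024, "K")
    print("order-4 folio:", folio_size(4) // 1024, "K")
    print("contpte span: ", contpte_span() // 1024, "K")
```

Under these assumptions an order-2 folio is 16K and falls short of the 64K contpte span, which corresponds to order-4.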