On Fri, Apr 19, 2024 at 02:47:21PM +0530, Kundan Kumar wrote:
> When mTHP is enabled, IO can contain larger folios instead of pages.
> In such cases add a larger size to the bio instead of looping through
> pages. This reduces the overhead of iterating through pages for larger
> block sizes. perf diff before and after this change:
>
> Perf diff for write I/O with 128K block size:
> 1.22% -0.97% [kernel.kallsyms] [k] bio_iov_iter_get_pages
> Perf diff for read I/O with 128K block size:
> 4.13% -3.26% [kernel.kallsyms] [k] bio_iov_iter_get_pages

I'm a bit confused by this to be honest. We already merge adjacent
pages, and it doesn't look to be _that_ expensive. Can you drill down
any further in the perf stats and show what the expensive part is?