Small patch series to:

 - firstly, refactor generic_file_buffered_read() enough that it can be
   modified in more interesting ways without going insane, and then

 - secondly, change it to use find_get_pages_contig() to batch up the page
   operations, and then copy data to userspace in a separate loop that
   touches no other shared cachelines.

I've been seeing profiles where the radix tree lookups in the buffered read
path are a shockingly large portion of the profile (around 25%, if memory
serves). That's what this patch series is addressing.

I've benchmarked small block reads as well: performance there is unaffected
or slightly improved (the difference is within the margin of error).

And as a bonus, the code that was all in generic_file_buffered_read() is now
_drastically_ easier to follow and modify. I haven't done as much refactoring
as I could have. I kept as much of the structure of the old code as I could,
just to make things easier on myself, but I'm still pretty happy with the
result.

Kent Overstreet (2):
  fs: Break generic_file_buffered_read up into multiple functions
  fs: generic_file_buffered_read() now uses find_get_pages_contig

 mm/filemap.c | 486 +++++++++++++++++++++++++++++----------------------
 1 file changed, 273 insertions(+), 213 deletions(-)

-- 
2.18.0