On Tue, Feb 27, 2024 at 03:39:35PM +0000, Matthew Wilcox wrote:
> On Tue, Feb 27, 2024 at 02:21:59AM -0500, Kent Overstreet wrote:
> > On Mon, Feb 26, 2024 at 03:48:35PM -0800, Linus Torvalds wrote:
> > > On Mon, 26 Feb 2024 at 14:46, Linus Torvalds
> > > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > I really haven't tested this AT ALL. I'm much too scared.
> > >
> > > "Courage is not the absence of fear, but acting in spite of it"
> > >          - Paddington Bear / Michal Scott
> > >
> > > It seems to actually boot here.
> > >
> > > That said, from a quick test with lots of threads all hammering on the
> > > same page - I'm still not entirely convinced it makes a difference.
> > > Sure, the kernel profile changes, but filemap_get_read_batch() wasn't
> > > very high up in the profile to begin with.
> > >
> > > I didn't do any actual performance testing, I just did a 64-byte pread
> > > at offset 0 in a loop in 64 threads on my 32c/64t machine.
> >
> > Only rough testing, but this is looking like around a 25% performance
> > increase doing 4k random reads on a 1G file with fio, 8 jobs, on my
> > Ryzen 5950x - 16.7M -> 21.4M iops, very roughly. fio's a pig and we're
> > only spending half our cpu time in the kernel, so the buffered read path
> > is actually getting 40% or 50% faster.
>
> Linus' patch only kicks in for 128 bytes or smaller. So what are you
> measuring?

64 byte reads
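
For reference, the quoted estimate that the buffered read path got "40% or 50% faster" can be reproduced with a back-of-the-envelope calculation. The sketch below is only an illustration: the function name and the model are assumptions, not something from the thread. It assumes that half of the per-read CPU time was spent in the kernel before the patch, and that the userspace (fio) cost per read did not change.

```python
def kernel_path_speedup(old_iops, new_iops, kernel_fraction=0.5):
    """Estimate the kernel-side improvement from an end-to-end iops change.

    Hypothetical model (an assumption, not from the thread):
    kernel_fraction of the per-read CPU time was in the kernel before
    the patch, and the userspace cost per read is unchanged by it.
    """
    old_time = 1.0 / old_iops                 # time per read, before
    new_time = 1.0 / new_iops                 # time per read, after
    user_time = (1.0 - kernel_fraction) * old_time
    old_kernel = kernel_fraction * old_time
    new_kernel = new_time - user_time         # what is left for the kernel
    if new_kernel <= 0:
        raise ValueError("model breaks down: no kernel time left")
    overall_gain = new_iops / old_iops - 1.0
    kernel_time_saved = 1.0 - new_kernel / old_kernel
    return overall_gain, kernel_time_saved


if __name__ == "__main__":
    # The very rough numbers quoted above: 16.7M -> 21.4M iops.
    overall, saved = kernel_path_speedup(16.7e6, 21.4e6)
    print(f"overall iops gain: {overall:.0%}")
    print(f"kernel time saved per read: {saved:.0%}")
```

With the quoted figures this gives an overall gain of about 28% and a reduction of about 44% in kernel time per read under the stated assumptions, which is in line with the rough "25%" and "40% or 50%" numbers.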