Re: [LSF/MM/BPF TOPIC] Measuring limits and enhancing buffered IO

Kent Overstreet <kent.overstreet@xxxxxxxxx> · Tue, 27 Feb 2024 10:54:15 -0500



On Tue, Feb 27, 2024 at 03:39:35PM +0000, Matthew Wilcox wrote:
> On Tue, Feb 27, 2024 at 02:21:59AM -0500, Kent Overstreet wrote:
> > On Mon, Feb 26, 2024 at 03:48:35PM -0800, Linus Torvalds wrote:
> > > On Mon, 26 Feb 2024 at 14:46, Linus Torvalds
> > > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > I really haven't tested this AT ALL. I'm much too scared.
> > > 
> > > "Courage is not the absence of fear, but acting in spite of it"
> > >          - Paddington Bear / Michael Scott
> > > 
> > > It seems to actually boot here.
> > > 
> > > That said, from a quick test with lots of threads all hammering on the
> > > same page - I'm still not entirely convinced it makes a difference.
> > > Sure, the kernel profile changes, but filemap_get_read_batch() wasn't
> > > very high up in the profile to begin with.
> > > 
> > > I didn't do any actual performance testing, I just did a 64-byte pread
> > > at offset 0 in a loop in 64 threads on my 32c/64t machine.
> > 
> > Only rough testing, but this is looking like around a 25% performance
> > increase doing 4k random reads on a 1G file with fio, 8 jobs, on my
> > Ryzen 5950x - 16.7M -> 21.4M iops, very roughly. fio's a pig and we're
> > only spending half our cpu time in the kernel, so the buffered read path
> > is actually getting 40% or 50% faster.
> 
> Linus' patch only kicks in for 128 bytes or smaller.  So what are you
> measuring?

64-byte reads.
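
A rough sketch of the kind of small-read test described upthread (a 64-byte
pread at offset 0 in a loop, from many threads hitting the same page cache
page). This is not the program anyone in the thread actually ran: the helper
name run_pread_bench(), the struct layout, and the thread and iteration counts
passed to it are all assumptions made for illustration. It only counts
successful reads; timing is left to the caller.

```c
/* Sketch: N threads each pread() 64 bytes from offset 0 of one file,
 * so every read lands on the same page cache page. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_SIZE 64	/* below the 128-byte cutoff mentioned in the thread */

struct worker {			/* hypothetical per-thread state */
	pthread_t tid;
	int fd;
	long iters;
	long ok;		/* number of full READ_SIZE reads */
};

static void *worker_fn(void *arg)
{
	struct worker *w = arg;
	char buf[READ_SIZE];

	for (long i = 0; i < w->iters; i++)
		if (pread(w->fd, buf, sizeof(buf), 0) == (ssize_t)sizeof(buf))
			w->ok++;
	return NULL;
}

/* Returns the total number of successful reads, or -1 on setup failure. */
static long run_pread_bench(const char *path, int nthreads, long iters)
{
	struct worker *w;
	long total = 0;
	int fd, started = 0;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	w = calloc(nthreads, sizeof(*w));
	if (!w) {
		close(fd);
		return -1;
	}

	/* All threads share one fd; pread() does not touch the file offset. */
	for (int i = 0; i < nthreads; i++) {
		w[i].fd = fd;
		w[i].iters = iters;
		if (pthread_create(&w[i].tid, NULL, worker_fn, &w[i]))
			break;
		started++;
	}

	for (int i = 0; i < started; i++) {
		pthread_join(w[i].tid, NULL);
		total += w[i].ok;
	}

	free(w);
	close(fd);
	return started == nthreads ? total : -1;
}
```

Calling run_pread_bench(path, 64, iters) on a file of at least 64 bytes, and
timing the call, would approximate the 64-thread case; the file should already
be in the page cache so that only the buffered read path is exercised.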