On Mon, May 23, 2022 at 09:44:12AM -0600, Jens Axboe wrote:
> On 5/23/22 9:12 AM, Jens Axboe wrote:
> >> Current branch pushed to #new.iov_iter (at the moment; will rename
> >> back to work.iov_iter once it gets more or less stable).
> >
> > Sounds good, I'll see what I need to rebase.
>
> On the previous branch, ran a few quick numbers. dd from /dev/zero to
> /dev/null, with /dev/zero using ->read() as it does by default:
>
> 32      260MB/sec
> 1k      6.6GB/sec
> 4k      17.9GB/sec
> 16k     28.8GB/sec
>
> now comment out ->read() so it uses ->read_iter() instead:
>
> 32      259MB/sec
> 1k      6.6GB/sec
> 4k      18.0GB/sec
> 16k     28.6GB/sec
>
> which are roughly identical, all things considered. Just a sanity check,
> but looks good from a performance POV in this basic test.
>
> Now let's do ->read_iter() but make iov_iter_zero() copy from the zero
> page instead:
>
> 32      250MB/sec
> 1k      7.7GB/sec
> 4k      28.8GB/sec
> 16k     71.2GB/sec
>
> Looks like it's a tad slower for 32-bytes, considerably better for 1k,
> and massively better at page size and above. This is on an Intel 12900K,
> so recent CPU. Let's try cacheline and above:
>
> Size    Method              BW
> 64      copy_from_zero()    508MB/sec
> 128     copy_from_zero()    1.0GB/sec
> 64      clear_user()        513MB/sec
> 128     clear_user()        1.0GB/sec

See this thread-of-doom:
https://lore.kernel.org/all/Ynq1nVpu1xCpjnXm@xxxxxxx/
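
For anyone wanting to rerun the block-size sweep quoted above, something along these lines should be close. It is only a sketch: the exact dd invocation is not given in the thread, so the `run_dd` helper name, the `count=100000` value, and the use of GNU dd's summary line are all assumptions. Whether /dev/zero goes through ->read() or ->read_iter() is decided by the kernel under test, not by this command.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): copy COUNT blocks of size BS
# from /dev/zero to /dev/null and pass dd's statistics through on stdout.
# LC_ALL=C keeps dd's messages unlocalized so they are easy to compare.
run_dd() {
    LC_ALL=C dd if=/dev/zero of=/dev/null bs="$1" count="$2" 2>&1
}

# Sweep the block sizes quoted in the mail. count=100000 is an assumed
# value; raise it if the runs are too short to give stable numbers.
for bs in 32 1k 4k 16k; do
    printf '%s\t' "$bs"
    # With GNU dd the last line of the statistics holds the throughput.
    run_dd "$bs" 100000 | tail -n 1
done
```

Using the same count for every block size means the small sizes move far less data than the large ones, so it may be worth scaling count per size when comparing against the figures in the mail.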