From: David Howells
> Sent: 15 September 2023 11:10
>
> David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> > > Add kunit tests to benchmark 256MiB copies to a UBUF iterator and an IOVEC
> > > iterator.  This attaches a userspace VM with a mapped file in it
> > > temporarily to the test thread.
> >
> > Isn't that going to be completely dominated by the cache fills
> > from memory?
>
> Yes...  but it should be consistent in the amount of time that consumes since
> no device drivers are involved.  I can try adding the same folio to the
> anon_file multiple times - it might work especially if I don't put the pages
> on the LRU (if that's even possible) - but I wanted separate pages for the
> extraction test.

You could also just not do the copy!
Although you need (say) asm volatile("\n" ::: "memory") to
stop it all being completely optimised away.
That might show up a difference in the 'out_of_line' test
where 15% on top of the data copies is massive - it may be
that the data cache behaviour is very different for the
two cases.

...
> > Some measurements can be made using readv() and writev()
> > on /dev/zero and /dev/null.
>
> Forget /dev/null; that doesn't actually engage any iteration code.  The same
> for writing to /dev/zero.  Reading from /dev/zero does its own iteration thing
> rather than using iterate_and_advance(), presumably because it checks for
> signals and resched.

Using /dev/null does exercise the 'copy iov from user' code.
Last time I looked at that the 32bit compat code was faster than
the 64bit code on x86!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
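
For illustration only, the "just not do the copy" idea above might look
roughly like the sketch below.  It assumes GCC or Clang extended asm; the
helper name walk_without_copy() and the barrier() macro are made up for
this example and are not taken from the patch or the kunit test.

```c
#include <stddef.h>

/*
 * Compiler-only barrier (GCC/Clang extended asm).  It emits no
 * instructions, but the "memory" clobber stops the compiler caching or
 * reordering memory accesses across it, and because the asm is volatile
 * it has to be kept once per loop iteration, so the loop survives.
 */
#define barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical helper: step through a buffer in chunks the way a copy
 * loop would, but leave out the copy itself.  Returns the number of
 * chunks visited so a caller can check the loop really ran.
 */
size_t walk_without_copy(const char *src, size_t len, size_t chunk)
{
	size_t n = 0;

	if (chunk == 0)
		return 0;

	while (len) {
		size_t part = len < chunk ? len : chunk;

		/* The memcpy() would go here; only the bookkeeping is left. */
		src += part;
		len -= part;
		n++;
		barrier();
	}
	return n;
}
```

Timing this loop against the same loop with the copy put back should give
some idea of how much of the total is iteration overhead rather than cache
fills, though a userspace loop like this is only a rough stand-in for the
kernel iterator code.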