On Thu, May 06, 2021 at 11:19:03AM -0600, Jens Axboe wrote:

> Doing a quick profile, on the latter run with ->write_iter() we're
> spending 8% of the time in _copy_from_iter(), and 4% in
> new_sync_write(). That's obviously not there at all for the first case.
> Both have about 4% in eventfd_write(). Non-iter case spends 1% in
> copy_from_user().
>
> Finally with your branch pulled in as well, iow using ->write_iter() for
> eventfd and your iov changes:
>
> Executed in  485.26 millis    fish           external
>    usr time  103.09 millis   70.00 micros  103.03 millis
>    sys time  382.18 millis   83.00 micros  382.09 millis
>
> Executed in  485.16 millis    fish           external
>    usr time  104.07 millis   69.00 micros  104.00 millis
>    sys time  381.09 millis   94.00 micros  381.00 millis
>
> and there's no real difference there. We're spending less time in
> _copy_from_iter() (8% -> 6%) and less time in new_sync_write(), but
> doesn't seem to manifest itself in reduced runtime.

Interesting... do you have instruction-level profiles for
_copy_from_iter() and new_sync_write() on the last of those trees?