From: Mikulas Patocka
> Sent: 20 January 2021 15:12
>
> On Wed, 20 Jan 2021, Jan Kara wrote:
>
> > Yeah, I agree. I'm against ext4 private solution for this read problem. And
> > I'm also against duplicating ->read_iter functionality in ->read handler.
> > The maintenance burden of this code duplication is IMHO just too big. We
> > rather need to improve the generic code so that the fast path is faster.
> > And every filesystem will benefit because this is not ext4 specific
> > problem.
> >
> > Honza
>
> Do you have some idea how to optimize the generic code that calls
> ->read_iter?
>
> vfs_read calls ->read if it is present. If not, it calls new_sync_read.
> new_sync_read's frame size is 128 bytes - it holds the structures iovec,
> kiocb and iov_iter. new_sync_read calls ->read_iter.
>
> I have found out that the cost of calling new_sync_read is 3.3%, Zhongwei
> found out 3.9%. (the benchmark repeatedly reads the same 4k page)
>
> I don't see any way how to optimize new_sync_read or how to reduce its
> frame size. Do you?

Why is the 'read_iter' path not just the same as the 'read' one,
but calling copy_to_iter() instead of copy_to_user()?

For a single fragment iov[] the difference might just be measurable
for a single byte read.
But by the time you are transferring 4k it ought to be minuscule.
It isn't as though you have the cost of reading the iov[] from userspace.
(That hits sendmsg() v send().)

	David
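
To make the comparison above concrete, here is a rough userspace sketch of
the two paths. It is not kernel code: every model_* name and the *_model
structs are hypothetical stand-ins invented for illustration, the kiocb is
left out entirely, and memcpy() stands in for the real user copy. It only
shows the shape of the argument: the ->read_iter route wraps the user buffer
in a one-segment iterator on the stack and then ends up doing the same copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for struct iovec / struct iov_iter (assumed layout,
 * NOT the kernel definitions). */
struct iovec_model {
	void *base;
	size_t len;
};

struct iov_iter_model {
	const struct iovec_model *iov;	/* current segment */
	size_t nr_segs;			/* segments remaining */
	size_t iov_offset;		/* offset into current segment */
	size_t count;			/* total bytes remaining */
};

static char page[4096];	/* hypothetical cached page being read */

/* Models copy_to_user(): returns the number of bytes NOT copied. */
static size_t model_copy_to_user(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/* Models copy_to_iter(): walks the segments, returns bytes copied. */
static size_t model_copy_to_iter(const void *src, size_t n,
				 struct iov_iter_model *it)
{
	const char *p = src;
	size_t done = 0;

	if (n > it->count)
		n = it->count;
	while (done < n) {
		assert(it->nr_segs > 0);
		size_t room = it->iov->len - it->iov_offset;
		size_t chunk = (n - done < room) ? n - done : room;

		model_copy_to_user((char *)it->iov->base + it->iov_offset,
				   p + done, chunk);
		done += chunk;
		it->iov_offset += chunk;
		it->count -= chunk;
		if (it->iov_offset == it->iov->len) {	/* segment full */
			it->iov++;
			it->nr_segs--;
			it->iov_offset = 0;
		}
	}
	return done;
}

/* A ->read style handler: copies straight to the user buffer. */
static long model_read(char *ubuf, size_t len)
{
	if (len > sizeof(page))
		len = sizeof(page);
	return (long)(len - model_copy_to_user(ubuf, page, len));
}

/* A ->read_iter style handler: same copy, via the iterator. */
static long model_read_iter(struct iov_iter_model *to)
{
	size_t len = to->count < sizeof(page) ? to->count : sizeof(page);

	return (long)model_copy_to_iter(page, len, to);
}

/* Loosely modelled on new_sync_read(): build the on-stack state, then
 * call the ->read_iter style handler. */
static long model_new_sync_read(char *ubuf, size_t len)
{
	struct iovec_model iov = { ubuf, len };
	struct iov_iter_model iter = { &iov, 1, 0, len };

	return model_read_iter(&iter);
}
```

For a single-segment iov[] the loop in model_copy_to_iter() runs once, so
the extra work per call in this model is the on-stack setup plus a few
compares. Those are fixed costs, which is why they would show up on a one
byte read and not on a 4k one. It says nothing about where the measured
3.3% to 3.9% actually goes.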