On Tue, Aug 20, 2019 at 11:00:39AM +0800, Joseph Qi wrote:
>
> I've tested parallel dio reads with dioread_nolock, it doesn't have
> significant performance improvement and still poor compared with reverting
> parallel dio reads. IMO, this is because with parallel dio reads, it take
> inode shared lock at the very beginning in ext4_direct_IO_read().

Why is that a problem? It's a shared lock, so parallel threads should
be able to issue reads without getting serialized?

Are you using sufficiently fast storage devices that you're worried
about cache line bouncing of the shared lock? Or do you have some
other concern, such as some other thread taking an exclusive lock?

- Ted