Hi Ted,

On 19/8/21 00:08, Theodore Y. Ts'o wrote:
> On Tue, Aug 20, 2019 at 11:00:39AM +0800, Joseph Qi wrote:
>>
>> I've tested parallel dio reads with dioread_nolock, it doesn't have
>> significant performance improvement and still poor compared with reverting
>> parallel dio reads. IMO, this is because with parallel dio reads, it take
>> inode shared lock at the very beginning in ext4_direct_IO_read().
>
> Why is that a problem? It's a shared lock, so parallel threads should
> be able to issue reads without getting serialized?
>
The above just reports the result: even when mounted with dioread_nolock,
parallel dio reads still perform worse than before (w/o parallel dio
reads).

> Are you using sufficiently fast storage devices that you're worried
> about cache line bouncing of the shared lock? Or do you have some
> other concern, such as some other thread taking an exclusive lock?
>
The test case is the random read/write workload described in my first
mail. From my preliminary investigation, the shared lock consumes more
in such a scenario.

Thanks,
Joseph