On 19/8/21 11:34, Theodore Y. Ts'o wrote:
> On Wed, Aug 21, 2019 at 09:04:57AM +0800, Joseph Qi wrote:
>> On 19/8/21 00:08, Theodore Y. Ts'o wrote:
>>> On Tue, Aug 20, 2019 at 11:00:39AM +0800, Joseph Qi wrote:
>>>>
>>>> I've tested parallel dio reads with dioread_nolock; it doesn't give a
>>>> significant performance improvement and is still poor compared with
>>>> reverting parallel dio reads. IMO, this is because with parallel dio
>>>> reads, it takes the inode shared lock at the very beginning in
>>>> ext4_direct_IO_read().
>>>
>>> Why is that a problem? It's a shared lock, so parallel threads should
>>> be able to issue reads without getting serialized?
>>>
>> The above just reports the result that even when mounting with
>> dioread_nolock, parallel dio reads still perform worse than before
>> (w/o parallel dio reads).
>
> Right, but you were asserting that the performance hit was *because* of
> the shared lock. I'm asking what led you to have that opinion.
> The fact that parallel dio reads perform poorly doesn't necessarily mean
> that it was because of that particular shared lock. It could be due to
> any number of other things. Have you looked at /proc/lock_stat (enabled
> via CONFIG_LOCK_STAT) to see where the locking bottlenecks might be?
>

I've enabled CONFIG_LOCK_STAT and CONFIG_DEBUG_RWSEMS, but I don't see any
statistics for i_rwsem. Am I missing something?

Thanks,
Joseph