Hi all,

Christoph's refactoring of xfs_ilock() brought up this question, but AFAICS the current behavior seems to have always been this way for xfs.

Since commit 6552321831dc ("xfs: remove i_iolock and use i_rwsem in the VFS inode instead"), xfs_file_buffered_aio_read() is the only call site I know of that calls generic_file_read_iter() with the read side of i_rwsem held. This lock is killing the performance of a multi-threaded mixed buffered read/write workload on the same file [1].

Attached is the output of the bcc tools [2] scripts xfsdist and ext4dist, showing the latency distribution for the same mixed read/write workload. Compared to ext4, the average read latency on a RAID of spinning disks can be two orders of magnitude higher (>100ms). I can provide more performance numbers with fio if needed, but they won't surprise anyone considering the extra lock.

This workload simulates a performance issue we are seeing on deployed systems. There are other ways for us to work around it (not using xfs on those systems would be one), but I wanted to understand the reason for this behavior first.

My questions are: is the purpose of this lock to synchronize direct and buffered I/O? If so, was making this behavior optional via a mount option ever considered for xfs? And am I the first to ask about this specific workload on xfs (I couldn't find anything on Google), or is this a known issue/trade-off/design choice of xfs?

Thanks,
Amir.

[1] https://github.com/amir73il/filebench/blob/overlayfs-devel/workloads/randomrw.f
[2] https://github.com/iovisor/bcc
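For reference, this is my reading of the buffered read path in question, paraphrased from fs/xfs/xfs_file.c (the function names are real; the exact declarations are from memory and may not match the tree verbatim):

/* Paraphrased from fs/xfs/xfs_file.c (~v4.10), not a verbatim quote. */
STATIC ssize_t
xfs_file_buffered_aio_read(
	struct kiocb		*iocb,
	struct iov_iter		*to)
{
	struct xfs_inode	*ip = XFS_I(file_inode(iocb->ki_filp));
	ssize_t			ret;

	/*
	 * Since the commit above, XFS_IOLOCK_SHARED maps to
	 * down_read() of the VFS inode's i_rwsem.
	 */
	xfs_ilock(ip, XFS_IOLOCK_SHARED);
	ret = generic_file_read_iter(iocb, to);
	xfs_iunlock(ip, XFS_IOLOCK_SHARED);

	return ret;
}

So every buffered read serializes against any buffered write holding i_rwsem exclusive on the same inode, which, as far as I can tell, is where the extra read latency in the attached histograms comes from.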
[Attachment: ext4dist.out]
[Attachment: xfsdist.out]