On Tue, Jun 21, 2022 at 03:53:33PM +0300, Amir Goldstein wrote:
> On Tue, Jun 21, 2022 at 11:59 AM Jan Kara <jack@xxxxxxx> wrote:
> >
> > On Tue 21-06-22 10:49:48, Amir Goldstein wrote:
> > > > How exactly do you imagine the synchronization of buffered read
> > > > against buffered write would work? Lock all pages for the read
> > > > range in the page cache? You'd need to be careful not to bring
> > > > the machine OOM when someone asks to read a huge range...
> > >
> > > I imagine that the atomic r/w synchronisation will remain *exactly*
> > > as it is today, by taking XFS_IOLOCK_SHARED around
> > > generic_file_read_iter() when reading data into the user buffer,
> > > but before that, I would like to issue and wait for the read of the
> > > pages in the range, to reduce the probability of doing the read I/O
> > > under XFS_IOLOCK_SHARED.
> > >
> > > The pre-warm of the page cache does not need to abide by the atomic
> > > read semantics, and it is also tolerable if some pages are evicted
> > > between the pre-warm and the read into the user buffer - in the
> > > worst case this will result in I/O amplification, but for the
> > > common case, it will be a big win for mixed random r/w performance
> > > on xfs.
> > >
> > > To reduce the risk of page cache thrashing, we can limit this
> > > optimization to a maximum number of pre-warmed pages.
> > >
> > > The questions are:
> > > 1. Does this plan sound reasonable?
> >
> > Ah, I see now. So essentially the idea is to pull the readahead
> > (which currently happens from filemap_read() -> filemap_get_pages())
> > out from under the i_rwsem. It looks like a fine idea to me.
>
> Great!
> Does anyone dislike the idea or have another suggestion?

I guess I'm still confused.  The problem was that the XFS IOLOCK was
being held while we waited for readahead to complete.  To fix this,
you're planning on waiting for readahead to complete with the
invalidate lock held?  I don't see the benefit.  I see the
invalidate_lock as being roughly equivalent to the IOLOCK, just pulled
up to the VFS.  Is that incorrect?
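
For reference, a minimal, untested sketch of the pre-warm idea Amir
describes above, against roughly v5.19-era interfaces.  The names
xfs_file_prewarm_range(), xfs_file_buffered_read_prewarm() and
MAX_PREWARM_PAGES are invented for illustration, not proposed kernel
symbols; page_cache_sync_readahead(), filemap_get_folio(),
folio_wait_locked(), xfs_ilock()/xfs_iunlock() and
generic_file_read_iter() are real interfaces.  This is not a patch,
just an illustration of where the readahead would move:

#include <linux/fs.h>
#include <linux/pagemap.h>
#include <linux/uio.h>
#include "xfs.h"	/* xfs_inode, xfs_ilock(), XFS_IOLOCK_SHARED */

#define MAX_PREWARM_PAGES	256	/* arbitrary cap against thrashing */

static void xfs_file_prewarm_range(struct file *file, loff_t pos, size_t len)
{
	struct address_space *mapping = file->f_mapping;
	pgoff_t index = pos >> PAGE_SHIFT;
	pgoff_t last = (pos + len - 1) >> PAGE_SHIFT;

	/* Bound the pre-warm so a huge read can't thrash the page cache. */
	last = min(last, index + MAX_PREWARM_PAGES - 1);

	/* Kick off readahead for whatever is missing from the cache. */
	page_cache_sync_readahead(mapping, &file->f_ra, file, index,
				  last - index + 1);

	/*
	 * Wait for the reads to complete.  Readahead folios stay locked
	 * until I/O finishes, so waiting on the folio lock approximates
	 * waiting for the read.  Folios evicted after this point are
	 * tolerated; they are simply re-read under the lock (the I/O
	 * amplification mentioned above).
	 */
	for (; index <= last; index++) {
		struct folio *folio = filemap_get_folio(mapping, index);

		if (!folio)
			continue;
		folio_wait_locked(folio);
		folio_put(folio);
	}
}

static ssize_t xfs_file_buffered_read_prewarm(struct kiocb *iocb,
					      struct iov_iter *to)
{
	struct xfs_inode *ip = XFS_I(file_inode(iocb->ki_filp));
	ssize_t ret;

	/* Pre-warm outside the lock; no atomicity is needed here. */
	xfs_file_prewarm_range(iocb->ki_filp, iocb->ki_pos,
			       iov_iter_count(to));

	/* Atomic r/w semantics are still provided by the IOLOCK, as today. */
	xfs_ilock(ip, XFS_IOLOCK_SHARED);
	ret = generic_file_read_iter(iocb, to);
	xfs_iunlock(ip, XFS_IOLOCK_SHARED);
	return ret;
}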