On Tue, Feb 28, 2023 at 10:52:15PM -0500, Theodore Ts'o wrote:
> For example, most cloud storage devices are doing read-ahead to try to
> anticipate read requests from the VM. This can interfere with the
> read-ahead being done by the guest kernel. So being able to tell
> cloud storage device whether a particular read request is stemming
> from a read-ahead or not. At the moment, as Matthew Wilcox has
> pointed out, we currently use the read-ahead code path for synchronous
> buffered reads. So plumbing this information so it can passed through
> multiple levels of the mm, fs, and block layers will probably be
> needed.

This shouldn't be _too_ painful. For example, the NVMe driver already
does the right thing:

	if (req->cmd_flags & (REQ_FAILFAST_DEV | REQ_RAHEAD))
		control |= NVME_RW_LR;

	if (req->cmd_flags & REQ_RAHEAD)
		dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;

(LR is Limited Retry; FREQ_PREFETCH is "Speculative read. The command
is part of a prefetch operation")

The only problem is that the readahead code doesn't tell the filesystem
whether the request is sync or async. This should be a simple matter
of adding a new 'bool async' to the readahead_control and then setting
REQ_RAHEAD based on that, rather than on whether the request came in
through readahead() or read_folio() (eg see mpage_readahead()).

Another thing to fix is that SCSI doesn't do anything with the REQ_RAHEAD
flag, so I presume T10 has some work to do (maybe they could borrow the
Access Frequency field from NVMe, since that was what the drive vendors
told us they wanted; maybe they changed their minds since).
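
To make the 'bool async' idea concrete, here is a rough userspace mock of
how it might look. This is only a sketch, not kernel code: every mock_ /
MOCK_ name is made up for illustration, the flag values are stand-ins
rather than the real kernel definitions, and the 'async' member is the
proposed field, which does not exist in readahead_control today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in flag values for this userspace sketch.  The real definitions
 * live in the kernel headers and are deliberately not reproduced here.
 */
#define MOCK_REQ_FAILFAST_DEV		(1u << 0)
#define MOCK_REQ_RAHEAD			(1u << 1)
#define MOCK_NVME_RW_LR			(1u << 15)
#define MOCK_NVME_RW_DSM_FREQ_PREFETCH	(7u << 0)

/*
 * Hypothetical, cut-down readahead_control carrying only the proposed
 * flag.  Assumption: true means speculative readahead, false means a
 * caller is waiting on the data.
 */
struct mock_readahead_control {
	bool async;
};

/*
 * Filesystem side: pick the request flags from rac->async instead of
 * from which entry point (readahead vs. read_folio) was used.
 */
static uint32_t mock_read_op_flags(const struct mock_readahead_control *rac)
{
	assert(rac != NULL);
	return rac->async ? MOCK_REQ_RAHEAD : 0;
}

/* Driver side: same shape as the NVMe snippet quoted above. */
static void mock_nvme_setup_rw(uint32_t cmd_flags, uint32_t *control,
			       uint32_t *dsmgmt)
{
	*control = 0;
	*dsmgmt = 0;

	if (cmd_flags & (MOCK_REQ_FAILFAST_DEV | MOCK_REQ_RAHEAD))
		*control |= MOCK_NVME_RW_LR;

	if (cmd_flags & MOCK_REQ_RAHEAD)
		*dsmgmt |= MOCK_NVME_RW_DSM_FREQ_PREFETCH;
}
```

With this shape, a synchronous buffered read that happens to go through
the readahead path would leave async false, so the device would see
neither the limited-retry bit nor the prefetch hint; only genuinely
speculative reads would be marked.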