Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> We're defeating the ondemand_readahead() algorithm here.  Let's suppose
> userspace is doing 64kB reads, the filesystem is OrangeFS which only
> wants to do 4MB reads, the page cache is initially empty and there's
> only one thread doing a sequential read.  ondemand_readahead() calls
> get_init_ra_size() which tells it to allocate 128kB and set the async
> marker at 64kB.  Then orangefs calls readahead_expand() to allocate the
> remainder of the 4MB.  After the app has read the first 64kB, it comes
> back to read the next 64kB, sees the readahead marker and tries to trigger
> the next batch of readahead, but it's already present, so it does nothing
> (see page_cache_ra_unbounded() for what happens with pages present).

It sounds like Christoph is on the right track and the vm needs to ask the
filesystem (and by extension, the cache) before doing the allocation and
before setting the trigger flag.  Then we don't need to call back into the vm
to expand the readahead.

Also, there's Steve's request to consider: keeping at least two requests in
flight at the same time for CIFS/SMB.

David