On Fri, May 13, 2022 at 06:33:21AM +0100, Phillip Lougher wrote:
> Looking at the new patch, I have a couple of questions which is worth
> clarifying to have a fuller understanding of the readahead behaviour.
> In otherwords I'm deducing the behaviour of the readahead calls
> from context, and I want to make sure they're doing what I think
> they're doing.

I did write quite a lot of documentation as part of the readahead
revision, and filesystem authors are the target audience, so this is
somewhat disheartening to read.  What could I have done better to make
the readahead documentation obvious for you to find?

> +	nr_pages = min(readahead_count(ractl), max_pages)
>
> As I understand it, this will always produce nr_pages which will
> cover the entirety of the block to be decompressed?  That is if
> a block is block_size, it will return the number of pages necessary
> to decompress the entire block?  It will never return less than the
> necessary pages, i.e. if the block_size was 128K, it will never
> return less than 32 pages?

readahead_count() returns the number of pages that the page cache is
asking the filesystem for.  It may be any number from 1 to whatever the
current readahead window is.  It's possible to ask the page cache to
expand the readahead request to be aligned to a decompression boundary,
but that may not be possible.  For example, we may be in a situation
where we read pages 32-63 from the file previously, then the page cache
chose to discard pages 33, 35, 37, ..., 63 under memory pressure, and
now the file is being re-read.  This isn't a likely usage pattern, of
course, but it's a situation we have to cope with.

> +	nr_pages = __readahead_batch(ractl, pages, max_pages)
>
> My understanding is that this call will fully populate the
> pages array with page references without any holes.  That
> is none of the pages array entries will be NULL, meaning
> there isn't a page for that entry.  In other words, if the
> pages array has 32 pages, each of the 32 entries will
> reference a page.

That is correct; a readahead request is always for a contiguous range
of the file.  The pages are allocated before calling ->readahead, so
there's no opportunity for failure; they exist and they're already in
the page cache, waiting for the FS to tell the page cache that they're
uptodate and unlock them.