Both Kent and David have had conversations with me about improving the
readahead filesystem interface this last week, and as I don't have time
to write the code, here's the design.

1. Kent doesn't like it that we do an XArray lookup for each page.
The proposed solution adds a (small) array of page pointers (or a
pagevec) to the struct readahead_control.  It may make sense to move
__readahead_batch() and readahead_page() out of line at that point.
This should be backed up with performance numbers.

2. David wants to be sure that readahead is aligned to a granule size
(eg 256kB) to support fscache.  When we last talked about it, I
suggested encoding the granule size in the struct address_space.
I no longer think this approach should be pursued, since ...

3. Kent wants to be able to expand readahead to encompass an entire
fs extent (if, eg, that extent is compressed or encrypted).  We don't
know that at the right point; the filesystem can't pass that information
through the generic_file_buffered_read() or filemap_fault() interface
to the readahead code.  So the right approach here is for the filesystem
to ask the readahead code to expand the readahead batch.

So solving #2 and #3 looks like a new interface for filesystems to call:

	void readahead_expand(struct readahead_control *rac, loff_t start, u64 len);

or possibly

	void readahead_expand(struct readahead_control *rac, pgoff_t start,
			unsigned int count);

It might not actually expand the readahead attempt at all -- for
example, if there's already a page in the page cache, or if it can't
allocate memory.  But this puts the responsibility for allocating pages
in the VFS, where it belongs.

4. Mike wants to be able to do 4MB I/Os [1].  That should be covered
by the solution above.  Mike, just to clarify.  Do you need 4MB pages,
or can you work with some mixture of page sizes going as far as 1024 x
4kB pages?

5. I'm allocating larger pages in the readahead code (part of the THP
patch set [2])

[1] https://lore.kernel.org/linux-fsdevel/CAOg9mSSrJp2dqQTNDgucLoeQcE_E_aYPxnRe5xphhdSPYw7QtQ@xxxxxxxxxxxxxx/
[2] http://git.infradead.org/users/willy/pagecache.git/commitdiff/c00bd4082c7bc32a17b0baa29af6974286978e1f
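
To make the best-effort semantics of the proposed readahead_expand() in
points 2 and 3 concrete, here is a rough userspace model of the pgoff_t
variant.  It is only a sketch, not kernel code: struct model_cache,
struct model_rac and every model_*() function are hypothetical stand-ins
invented for illustration, the page cache is reduced to an array of
booleans, and allocation failure is faked with a simple page budget.
The one thing it tries to show is that expansion stops, without error,
at the first page that is already cached or cannot be allocated.

```c
/* Userspace model only: none of these types or functions are the kernel's. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PAGES 4096	/* size of the toy "file", in pages */

/* Toy page cache: present[i] is true if page i is already cached. */
struct model_cache {
	bool present[MODEL_NR_PAGES];
	size_t alloc_budget;	/* pages we may still allocate (fakes ENOMEM) */
};

/* Toy stand-in for struct readahead_control: window [start, start + count). */
struct model_rac {
	struct model_cache *cache;
	size_t start;
	size_t count;
};

/* Try to allocate and insert one page; fail if cached or out of budget. */
static bool model_add_page(struct model_cache *c, size_t index)
{
	if (index >= MODEL_NR_PAGES || c->present[index] || c->alloc_budget == 0)
		return false;
	c->present[index] = true;
	c->alloc_budget--;
	return true;
}

/*
 * Best-effort: try to grow the window to cover [new_start, new_start +
 * new_count).  The window stays contiguous and may end up smaller than
 * requested, or unchanged.
 */
static void model_readahead_expand(struct model_rac *rac, size_t new_start,
				   size_t new_count)
{
	size_t new_end = new_start + new_count;

	assert(rac->count > 0);

	/* Grow downwards, one page at a time, until something is in the way. */
	while (rac->start > new_start) {
		if (!model_add_page(rac->cache, rac->start - 1))
			break;
		rac->start--;
		rac->count++;
	}

	/* Grow upwards in the same way. */
	while (rac->start + rac->count < new_end) {
		if (!model_add_page(rac->cache, rac->start + rac->count))
			break;
		rac->count++;
	}
}

/* What a granule-aligned caller (point 2) might ask for. */
static void model_expand_to_granule(struct model_rac *rac, size_t granule_pages)
{
	size_t start = rac->start - rac->start % granule_pages;
	size_t end = rac->start + rac->count;

	end = (end + granule_pages - 1) / granule_pages * granule_pages;
	model_readahead_expand(rac, start, end - start);
}
```

With 4kB pages, a 256kB granule is 64 pages, so a window covering pages
70-79 would be asked to grow to pages 64-127.  If page 66 is already
cached, the model stops growing downwards at page 67 and the caller has
to cope with a window that is not granule-aligned, which is the
"might not actually expand" case described above.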