Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:

> On Sun, Aug 30, 2020 at 10:54:35AM +0900, OGAWA Hirofumi wrote:
>> Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:
>>
>> Hm, io_pages is limited by driver setting too, and io_pages can be lower
>> than ra_pages, e.g. usb storage.
>>
>> Assuming ra_pages is user intent of readahead window. So if io_pages is
>> lower than ra_pages, this try ra_pages to align of io_pages chunk, but
>> not bigger than ra_pages. Because if block layer splits I/O requests to
>> hard limit, then I/O is not optimal.
>>
>> So it is intent, I can be misunderstanding though.
>
> Looking at this some more, I'm not sure it makes sense to consult ->io_pages
> at all. I see how it gets set to 0 -- the admin can write '1' to
> /sys/block/<device>/queue/max_sectors_kb and that gets turned into 0
> in ->io_pages.

	if (max_sectors_kb > max_hw_sectors_kb || max_sectors_kb < page_kb)
		return -EINVAL;

It should not be possible to set it to 0 via /sys/.../max_sectors_kb.
However, the default of bdi->io_pages is 0, so this happens if a driver
didn't initialize it.

> But I'm not sure it makes any sense to respect that. Looking at
> mm/readahead.c, all it does is limit the size of a read request which
> exceeds the current readahead window. It's not used to limit the
> readahead window itself. For example:
>
>	unsigned long max_pages = ra->ra_pages;
> ...
>	if (req_size > max_pages && bdi->io_pages > max_pages)
>		max_pages = min(req_size, bdi->io_pages);
>
> Setting io_pages below ra_pages has no effect. So maybe fat should also
> disregard it?

              |-----------------------|  requested blocks

[before]
    ra_pages  |===========|===========|===========|
    io_pages  |---------|---------|---------|
    req       |---------|-|-------|---|

[after]
    ra_pages  |=========|=========|=========|
    io_pages  |---------|---------|---------|
    req       |---------|---------|---|

This path is known to be a large sequential read. Well, anyway, the intent
is to use [after] with 3 reqs, instead of [before] with 4 reqs.

Thanks.
-- OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>
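
The [before]/[after] request counts in the diagram can be reproduced with a small userspace sketch. This is not the fat or block-layer code: align_window() and count_requests() are hypothetical helpers, and the model simply assumes each readahead window is issued separately and then cut into pieces of at most io_pages. The zero check reflects the point made in the message that bdi->io_pages defaults to 0.

```c
#include <assert.h>

/*
 * Hypothetical helper: round the readahead window down to a multiple of
 * io_pages, never exceeding ra_pages. If io_pages is 0 (driver did not
 * initialize it) or is not smaller than ra_pages, keep ra_pages unchanged.
 */
static unsigned long align_window(unsigned long ra_pages,
				  unsigned long io_pages)
{
	if (io_pages == 0 || io_pages >= ra_pages)
		return ra_pages;
	return ra_pages - ra_pages % io_pages;
}

/*
 * Hypothetical model: a sequential read of 'total' pages is issued one
 * readahead window at a time, and each window is split into requests of
 * at most io_pages. Returns the number of requests. An io_pages of 0 is
 * treated as "no split limit".
 */
static unsigned long count_requests(unsigned long total,
				    unsigned long window,
				    unsigned long io_pages)
{
	unsigned long reqs = 0, pos = 0;

	assert(window > 0);	/* a zero window would never make progress */

	while (pos < total) {
		unsigned long left = total - pos;
		unsigned long len = left < window ? left : window;

		/* ceil(len / io_pages) requests for this window */
		reqs += io_pages ? (len + io_pages - 1) / io_pages : 1;
		pos += len;
	}
	return reqs;
}
```

With illustrative values loosely scaled from the diagram (ra_pages = 12, io_pages = 10, a 24-page read), count_requests(24, 12, 10) gives 4, while count_requests(24, align_window(12, 10), 10) gives 3.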