On 3/7/2024 5:23 PM, Jan Kara wrote:
> Thanks for testing! This is an interesting result and certainly unexpected for me. The readahead code allocates naturally aligned pages so based on the distribution of allocations it seems that before commit ab4443fe3ca6 readahead window was at least 32 pages (128KB) aligned and so we allocated order 5 pages. After the commit, the readahead window somehow ended up only aligned to 20 modulo 32. To follow natural alignment and fill 128KB readahead window we allocated order 2 page (got us to offset 24 modulo 32), then order 3 page (got us to offset 0 modulo 32), order 4 page (larger would not fit in 128KB readahead window now), and order 2 page to finish filling the readahead window.
>
> Now I'm not 100% sure why the readahead window alignment changed with different rounding when placing readahead mark - probably that's some artifact when readahead window is tiny in the beginning before we scale it up (I'll verify by tracing whether everything ends up looking correctly with the current code). So I don't expect this is a problem in ab4443fe3ca6 as such but it exposes the issue that readahead page insertion code should perhaps strive to achieve better readahead window alignment with logical file offset even at the cost of occasionally performing somewhat shorter readahead. I'll look into this once I dig out of the huge heap of email after vacation...
Hi Jan,

I am also curious about this behavior and tried adding logs to understand it. Here are the differences with and without ab4443fe3ca6:

- With ab4443fe3ca6:
  You are right about the folio order, as the readahead window is 0x20. The folio order sequence is order 2, order 4, order 3, order 2. The difference is that the first order 2 folio is always the one marked readahead, so the max order is boosted to 4 in page_cache_ra_order(). The code path always hits
      if (index == expected || index == (ra->start + ra->size))
  in ondemand_readahead().

  If I just change round_down() to round_up() in ra_alloc_folio(), the dominant folio order is restored to 5.

- Without ab4443fe3ca6:
  At the beginning, the folio order sequence is the same: 2, 4, 3, 2. But besides the first order 2 folio, the order 4 folio is also marked readahead, so the order can be boosted to 5. Also, not only the path
      if (index == expected || index == (ra->start + ra->size))
  is hit, but
      if (folio) {
  can be hit as well (I didn't check the other paths, as this test is a sequential read).

  There is some back and forth between order 5 and 2, 4, 3, 2, and then the order stabilizes at 5.

I don't fully understand the whole thing yet and will dig deeper. The above is just what the log showed.

Hi Matthew,

I noticed one thing: while the readahead folio order is being pushed up, there are several times when readahead tries to allocate folios and add them to the page cache but fails, because a folio already inserted into the page cache covers the requested index. Once the folio order is correct, this no longer happens. I suppose this is expected.

Regards
Yin, Fengwei