On Mon, Nov 11, 2024 at 11:05 PM David Hildenbrand <david@xxxxxxxxxx> wrote: > > On 11.11.24 15:28, Yafang Shao wrote: > > On Mon, Nov 11, 2024 at 6:33 PM David Hildenbrand <david@xxxxxxxxxx> wrote: > >> > >> On 08.11.24 15:17, Yafang Shao wrote: > >>> When testing large folio support with XFS on our servers, we observed that > >>> only a few large folios are mapped when reading large files via mmap. > >>> After a thorough analysis, I identified it was caused by the > >>> `/sys/block/*/queue/read_ahead_kb` setting. On our test servers, this > >>> parameter is set to 128KB. After I tune it to 2MB, the large folio can > >>> work as expected. However, I believe the large folio behavior should not be > >>> dependent on the value of read_ahead_kb. It would be more robust if the > >>> kernel can automatically adopt to it. > >> > >> Now I am extremely confused. > >> > >> Documentation/ABI/stable/sysfs-block: > >> > >> "[RW] Maximum number of kilobytes to read-ahead for filesystems on this > >> block device." > >> > >> > >> So, with your patch, will we also be changing the readahead size to > >> exceed that, or simply allocate larger folios and not exceeding the > >> readahead size (e.g., leaving them partially non-filled)? > > > > Exceeding the readahead size for the MADV_HUGEPAGE case is > > straightforward; this is what the current patch accomplishes. > > > > Okay, so this only applies with MADV_HUGEPAGE I assume. Likely we should > also make that clearer in the subject. > > mm/readahead: allow exceeding configured read_ahead_kb with MADV_HUGEPAGE > > > If this is really a fix, especially one that deserves CC-stable, I > cannot tell. Willy is the obvious expert :) > > >> > >> If you're also changing the readahead behavior to exceed the > >> configuration parameter it would sound to me like "I am pushing the > >> brake pedal and my care brakes; fix the brakes to adopt whether to brake > >> automatically" :) > >> > >> Likely I am missing something here, and how the read_ahead_kb parameter > >> is used after your patch. > > > > The read_ahead_kb parameter continues to function for > > non-MADV_HUGEPAGE scenarios, whereas special handling is required for > > the MADV_HUGEPAGE case. It appears that we ought to update the > > Documentation/ABI/stable/sysfs-block to reflect the changes related to > > large folios, correct? > > Yes, how it related to MADV_HUGEPAGE. I would assume that it would get > ignored, but ... > > ... staring at get_next_ra_size(), it's not quite ignored, because we > still us it as a baseline to detect how much we want to bump up the > limit when the requested size is small? (*2 vs *4 etc) :/ > > So the semantics are really starting to get weird, unless I am missing > something important. > > [...] > > > Perhaps a more straightforward solution would be to implement it > > directly at the callsite, as demonstrated below? > > Likely something into this direction might be better, but Willy is the > expert that code. > > > > > diff --git a/mm/readahead.c b/mm/readahead.c > > index 3dc6c7a128dd..187efae95b02 100644 > > --- a/mm/readahead.c > > +++ b/mm/readahead.c > > @@ -642,7 +642,11 @@ void page_cache_async_ra(struct readahead_control *ractl, > > 1UL << order); > > if (index == expected) { > > ra->start += ra->size; > > - ra->size = get_next_ra_size(ra, max_pages); > > + /* > > + * Allow the actual size to exceed the readahead window for a > > + * large folio. > > "a large folio" -> "with MADV_HUGEPAGE" ? Or can this be hit on > different paths that are not covered in the patch description? This branch may also be triggered by other large folios that are not necessarily order-9. Therefore, I’ve referred to it as a 'large folio' rather than associating it specifically with MADV_HUGEPAGE. If we were to handle only the MADV_HUGEPAGE case, we would proceed as outlined in the initial RFC patch[0]. However, following Willy's recommendation, I implemented it this way, as he likely has a deeper understanding of the intended behavior. [0]. https://lore.kernel.org/linux-mm/20241104143015.34684-1-laoar.shao@xxxxxxxxx/ -- Regards Yafang