On Thu, Mar 02, 2023 at 10:55:48PM -0800, Nhat Pham wrote: > On Sun, Feb 19, 2023 at 4:21 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > > +/** > > > + * filemap_cachestat() - compute the page cache statistics of a mapping > > > + * @mapping: The mapping to compute the statistics for. > > > + * @first_index: The starting page cache index. > > > + * @last_index: The final page index (inclusive). > > > + * @cs: the cachestat struct to write the result to. > > > + * > > > + * This will query the page cache statistics of a mapping in the > > > + * page range of [first_index, last_index] (inclusive). The statistics > > > + * queried include: number of dirty pages, number of pages marked for > > > + * writeback, and the number of (recently) evicted pages. > > > + */ > > > > Do we care that this isn't going to work for hugetlbfs? > > I ran a quick test using hugetlbfs. It looks like the current > implementation is treating it in accordance to the multi-page > folio case we discussed earlier, i.e: > > - Returned number of "pages" is in terms of the number of > base/small pages (i.e 512 dirty pages instead of 1 dirty > huge page etc.) > - If we touch one byte in the huge page, it would report the > entire huge page as dirty, but again in terms of the underlying > pages. > > Is this what you have in mind, or is there another edge > case that I'm missing...? Hugetlbfs indexes its pages by hugepage number rather than by smallpage number. Imagine you have a 2MB folio at offset 4MB into the file. Filesystems other than hugetlbfs store it at indices 1024-1535. hugetlbfs stores it at index 2. So your report probably seems to work, but if you ask it about a range, you might be surprised by how wide that range will cover for hugetlbfs. I know Sidhartha is working on fixing that, but I'm not sure if what he has is working yet.