On Wed, Mar 30, 2022 at 10:15:32PM -0700, Christoph Hellwig wrote:
> On Wed, Mar 30, 2022 at 12:17:09PM -0400, Johannes Weiner wrote:
> > It's add_to_page_cache_lru() that sets the flag.
> >
> > Basically, when a PageWorkingset (hot) page gets reclaimed, the bit is
> > stored in the vacated tree slot. When the entry is brought back in,
> > add_to_page_cache_lru() transfers it to the newly allocated page.
>
> Ok. In this case my patch didn't quite do the right thing for readahead
> either. But that does leave a question for the btrfs compressed
> case, which only adds extra pages to a read to readahead a bigger
> cluster size - that is these pages are not read at the request of the
> VM. Does it really make sense to do PSI accounting for them in that
> case?

I think it does.

I suppose it's an argument about readahead pages in general, which
technically the workload itself doesn't commission explicitly. But
those pages are still triggered by a nearby access, their reads
contribute to device utilization, and if they're PageWorkingset it
means they're only being read because there is a lack of memory.

In a perfect world, readahead would stop when memory or IO are
contended. But it doesn't, and the stalls it can inject into the
workload are as real as stalls from directly requested reads.
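
To make the lifecycle above concrete, here is a rough userspace sketch of
it. This is a toy model, not kernel code: every toy_* name is made up for
illustration, and the single bool standing in for the shadow entry is an
assumption about how to model the vacated tree slot, not how the kernel
actually encodes it. It only shows the bit surviving eviction, being
transferred to the new page on refault, and a read then counting as a stall
regardless of whether it was a demand read or readahead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: one cache slot holds either a page or a shadow entry. */
struct toy_page {
	bool workingset;		/* stands in for PageWorkingset */
};

struct toy_slot {
	struct toy_page *page;		/* NULL when nothing is cached */
	bool has_shadow;		/* reclaim left a shadow entry behind */
	bool shadow_workingset;		/* workingset bit saved in that entry */
};

/* Reclaim: drop the page, but remember in the slot whether it was hot. */
void toy_evict(struct toy_slot *slot)
{
	if (!slot->page)
		return;
	slot->has_shadow = true;
	slot->shadow_workingset = slot->page->workingset;
	slot->page = NULL;
}

/*
 * Refault: models the step attributed to add_to_page_cache_lru() above.
 * The bit saved in the vacated slot is transferred to the new page.
 */
void toy_add_to_cache(struct toy_slot *slot, struct toy_page *page)
{
	assert(slot->page == NULL);
	page->workingset = slot->has_shadow && slot->shadow_workingset;
	slot->has_shadow = false;
	slot->shadow_workingset = false;
	slot->page = page;
}

/*
 * Should reading this page be accounted as a memory stall?  The answer
 * depends only on the workingset bit; is_readahead is deliberately
 * ignored, which is the position argued for above.
 */
bool toy_read_counts_as_stall(const struct toy_page *page, bool is_readahead)
{
	(void)is_readahead;
	return page->workingset;
}
```

Evicting a hot page and then calling toy_add_to_cache() with a freshly
allocated page leaves that page marked workingset, so
toy_read_counts_as_stall() returns true for it whether or not the read
came from readahead. A page added to a slot with no shadow entry stays
cold and is not counted.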