On 02-Apr-24 7:33 AM, Huang, Ying wrote:
> Bharata B Rao <bharata@xxxxxxx> writes:
>
>> On 29-Mar-24 6:44 AM, Huang, Ying wrote:
>>> Bharata B Rao <bharata@xxxxxxx> writes:
>>
>> <snip>
>>
>>>> I don't think the pages are cold but rather the existing mechanism fails
>>>> to categorize them as hot. This is because the pages were scanned way
>>>> before the accesses start happening. When repeated accesses are made to
>>>> a chunk of memory that was scanned a while back, none of those
>>>> accesses get classified as hot because the scan time is way behind
>>>> the current access time. That's the reason we are seeing the value
>>>> of latency ranging from 20s to 630s as shown above.
>>>
>>> If repeated accesses continue, the page will be identified as hot when
>>> it is scanned the next time, even if we don't expand the threshold
>>> range. If the repeated accesses last only a very short time, it makes
>>> little sense to identify the pages as hot. Right?
>>
>> The total allocated memory here is 192G and the chunk size is 1G. Each
>> time, one such 1G chunk is picked at random for generating memory
>> accesses. Within that 1G, 262144 random accesses are performed, and
>> this is repeated for 512 iterations. I thought that should be enough
>> to classify that chunk of memory as hot.
>
> IIUC, some pages are accessed in a very short time (maybe within 1ms).
> This isn't repeated access over a long period. I think that pages
> accessed repeatedly over a long period are good candidates for
> promotion, but pages accessed frequently for only a very short time
> aren't.

Here are the numbers for the 192nd chunk:

Each iteration of 262144 random accesses takes ~10ms.
512 such iterations take ~5s.
numa_scan_seq is 16 when this chunk is accessed.

No page promotions were done from this chunk: every time,
should_numa_migrate_memory() found the NUMA hint fault latency to be
higher than the threshold.

Are these time periods considered too short for the pages to be
detected as hot and promoted?

>
>> But as we see, oftentimes the scan time lags the access time by a
>> large value.
>>
>> Let me instrument the code further to learn more (if possible) about
>> the scanning/fault time behavior here.
>>
>> Leaving the fault-count-based threshold apart, do you think there is
>> value in updating the scan time for skipped pages/PTEs during every
>> scan, so that the scan time remains current for all the pages?
>
> No, I don't think so. That makes the hint page fault latency more
> inaccurate.

For the case I have shown, depending on a stale scan time doesn't work
well when pages are accessed long after they were scanned. At least
with the scheme I show in patch 2/2, the probability of detecting pages
as hot increases.

Regards,
Bharata.
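
P.S. To make the access pattern under discussion concrete, below is a
minimal user-space sketch of it. The sizes and counts are taken from
the numbers above; everything else (names, the RNG choice, the init
pass) is illustrative and not the actual benchmark code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE	(1UL << 30)	/* 1G per chunk */
#define NR_CHUNKS	192UL		/* 192G total */
#define NR_ACCESSES	262144
#define NR_ITERATIONS	512

int main(void)
{
	char *mem = malloc(NR_CHUNKS * CHUNK_SIZE);
	unsigned long sink = 0;

	if (!mem)
		return 1;
	memset(mem, 0, NR_CHUNKS * CHUNK_SIZE);	/* fault everything in */

	for (unsigned long c = 0; c < NR_CHUNKS; c++) {
		/* pick one 1G chunk at random */
		char *chunk = mem + (random() % NR_CHUNKS) * CHUNK_SIZE;

		/* ~10ms per iteration, ~5s per chunk, as measured above */
		for (int it = 0; it < NR_ITERATIONS; it++)
			for (int i = 0; i < NR_ACCESSES; i++)
				sink += chunk[random() % CHUNK_SIZE];
	}

	printf("sink: %lu\n", sink);	/* keep the accesses alive */
	free(mem);
	return 0;
}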
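
For completeness, the decision that keeps rejecting these pages boils
down to something like the check below. This is a simplified sketch
modeled loosely on should_numa_migrate_memory(), not the kernel code
itself; the real function derives the latency from the recorded page
access (scan) time and compares it against an adaptive hot threshold:

#include <stdbool.h>

/*
 * Sketch: the hint fault latency is the gap between when the page was
 * last scanned and when the hint fault fires. If the scan happened
 * long before the accesses started (20s..630s in the case above), the
 * latency exceeds any reasonable threshold and the page is treated as
 * cold.
 */
static bool page_is_hot(unsigned long scan_time,
			unsigned long fault_time,
			unsigned long threshold)
{
	unsigned long latency = fault_time - scan_time;

	return latency <= threshold;	/* above threshold: no promotion */
}

With the numbers above, every fault in the chunk sees a latency far
above the threshold, which is why no promotions happen regardless of
how hot the chunk actually is.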