On Fri, 31 Jan 2025 12:28:03 +0000
Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:

> > Here is the list of potential discussion points:
> ...
>
> > 2. Possibility of maintaining single source of truth for page hotness that would
> > maintain hot page information from multiple sources and let other sub-systems
> > use that info.
> Hi,
>
> I was thinking of proposing a separate topic on a single source of hotness,
> but this question covers it so I'll add some thoughts here instead.
> I think we are very early, but sharing some experience and thoughts in a
> session may be useful.

Thinking more on this over lunch, I think it is worth calling this out as a
potential session topic in its own right rather than trying to find time
within other sessions.  Hence the title change.

I think a session would start with a brief listing of the temperature sources
we have and those on the horizon to motivate what we are unifying, then
discussion to focus on the need for such a unification + requirements (maybe
with a straw man).

>
> What do the other subsystems that want to use a single source of page hotness
> want to be able to find out? (subject to filters like memory range, process etc)
>
> A) How hot is page X?
>   - Is this useful, or too much data?  What would use it?
>     * Application optimization maybe. Very handy for developing algorithms
>       to do the rest of the options here as an oracle!
>   - Provides both the cold and hot end of the scale, but maybe measurement
>     techniques vary and cannot be easily combined. Hard in general to combine
>     multiple sources of truth if aiming for an absolute number.
>
> B) Which pages are super hot?
>   - Probably those that make the most difference if they are in a slower memory tier.
>
> C) Some pages are hot enough to consider moving?
>   - This may be good enough to get the key data into the fast memory over time.
>   - Can combine sources of info as being able to compare precise numbers doesn't matter.
>
> D) Which pages are fairly cold?
>   - Likewise maybe good enough over time.
>
> E) Which pages are very cold?
>   - Ideal case for tiering. Swap these with the super hot ones.
>   - Maybe extra signal for swap / zswap etc
>
> F) Did these hot pages remain hot (and same for cold)?
>   - This is needed to know when to back off doing things as we have unstable
>     hotness (two phase applications are a pain for this), sampling a few
>     pages may be fine.
>
> Messy corners:
>
> Temporal aspects.
>   - If only providing lists of hottest / coldest in last second, very hard
>     to find those that are of a stable temperature. We end up moving
>     very hot data (which is disruptive) and it doesn't stay hot.
>   - Can reduce that effect by long sampling windows on some measurement approaches
>     (on hardware trackers that can trash accuracy due to resource exhaustion
>     and other subtle effects).
>   - bistable / phase based applications are a pain but perhaps up to higher
>     levels to back off.
>
> My main interest is migrating in tiered systems but good to look at what
> else would use a common layer.
>
> Mostly I want to know something that is useful to move, and assume convergence
> over the long term with the best things to move so to me the ideal layer has
> the following interface (strawman so shoot holes in it!):
>
> 1) Give me up to X hotish pages from a slow tier (greater than a specific measure
>    of temperature)
> 2) Give me X coldish pages from a faster tier.
> 3) I expect to ask again in X seconds so please have some info ready for me!
> 4) (a path to get an idea of 'unhelpful moves' from earlier iterations - this
>    is bleeding the tiering application into a shared interface though).
>
> If we have multiple subsystems using the data we will need to resolve their
> conflicting demands to generate good enough data with appropriate overhead.
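
To make requests 1) and 2) of the strawman a little more concrete, here is a
minimal userspace sketch in C. Everything in it is an assumption made for
illustration: struct page_temp, the two function names, the unitless
temperature scale and the tier numbering are hypothetical and do not correspond
to any existing kernel interface. A linear scan over a flat array stands in for
whatever data structure a shared hotness layer would really keep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed tier numbering, for this sketch only. */
#define TIER_FAST 0
#define TIER_SLOW 1

/* Hypothetical per-page record; not an existing kernel structure. */
struct page_temp {
	uint64_t pfn;       /* page frame number */
	unsigned int temp;  /* unitless temperature, higher means hotter */
	int tier;           /* TIER_FAST or TIER_SLOW */
};

/*
 * Strawman request 1): report up to max_out pages that live in the slow
 * tier and are hotter than min_temp. Returns how many PFNs were written.
 */
size_t hot_pages_from_slow_tier(const struct page_temp *pages, size_t nr_pages,
				unsigned int min_temp,
				uint64_t *out, size_t max_out)
{
	size_t found = 0;

	assert(pages && out);
	for (size_t i = 0; i < nr_pages && found < max_out; i++) {
		if (pages[i].tier == TIER_SLOW && pages[i].temp > min_temp)
			out[found++] = pages[i].pfn;
	}
	return found;
}

/*
 * Strawman request 2): report up to max_out pages that live in the fast
 * tier and are colder than max_temp. Returns how many PFNs were written.
 */
size_t cold_pages_from_fast_tier(const struct page_temp *pages, size_t nr_pages,
				 unsigned int max_temp,
				 uint64_t *out, size_t max_out)
{
	size_t found = 0;

	assert(pages && out);
	for (size_t i = 0; i < nr_pages && found < max_out; i++) {
		if (pages[i].tier == TIER_FAST && pages[i].temp < max_temp)
			out[found++] = pages[i].pfn;
	}
	return found;
}
```

Both calls only compare against a threshold, so the sketch never needs an
absolute, cross-source temperature. Requests 3) and 4) are left out on purpose;
they would need state kept between calls.
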
>
> I'd also like a virtualized solution for the case of hardware PA trackers (what
> I have with CXL Hotness Monitoring Units) and the classic memory pool / stranding
> avoidance case where the VM is the right entity to make migration decisions.
> Making that interface convey what the kernel is going to use would be an
> efficient option. I'd like to hide how the sausage was made from the VM.
>
> Jonathan
>