On Tue, Feb 16, 2021 at 11:00 AM Jonathan Cameron
<Jonathan.Cameron@xxxxxxxxxx> wrote:
[..]
> > For example, the persistent memory enabling assigns the closest
> > online memory node for the pmem device. That achieves the
> > traditional behavior of the device-driver allocating from "local"
> > memory by default. However, the HMAT-sysfs representation indicates
> > the numa node that pmem would represent were it to be online. So the
> > question is why does GI need more than that? To me a GI is "offline"
> > in terms of Linux node representations because numactl can't target
> > it,
>
> That's fair. It does exist in an intermediate world. Whether the
> internal representation of online vs offline should have anything much
> to do with numactl, rather than numactl being based on whether a node
> has online memory or CPUs, isn't clear to me. It already has to
> distinguish whether a node has CPUs and / or memory, so this isn't a
> huge conceptual extension.
>
> > "closest online" is good enough for a GI device driver,
>
> So that's the point. Is it 'good enough'? Maybe - in some cases.
>
> > but if userspace needs the next level of detail of the performance
> > properties, that's what HMEM sysfs is providing.
>
> sysfs is fine if you are doing userspace allocations or placement
> decisions. For GIs it can be relevant if you are using userspace
> drivers (or partially userspace drivers).

That's unfortunate; please tell me there's another use for this
infrastructure than userspace drivers? The kernel should not be
carrying core-mm debt purely on behalf of userspace drivers.

> In the GI case, from my point of view the sysfs stuff was a nice
> addition but not actually that useful. The info in HMAT is useful, but
> too little of it was exposed. What I don't yet have (due to lack of
> time) is a good set of examples that show more info is needed. Maybe
> I'll get to that one day!
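For reference, the HMEM sysfs interface being discussed is the kernel's
numaperf ABI, which publishes per-target-node access-class attributes
under /sys/devices/system/node/nodeN/accessM/initiators/ (bandwidth in
MB/s, latency in ns). Below is a minimal sketch of how userspace might
consume it; the function name is mine, and `root` is parameterized only
so the sketch can be exercised against a test tree instead of a live
/sys:

```python
import os

# The numaperf sysfs layout (Documentation/admin-guide/mm/numaperf.rst)
# exposes, per target node and access class, e.g.:
#   /sys/devices/system/node/nodeN/access0/initiators/read_bandwidth
#   /sys/devices/system/node/nodeN/access0/initiators/read_latency
# This sketch reads whichever of the four standard attributes exist.

def read_access_attrs(node, access=0, root="/sys/devices/system/node"):
    """Return {attr: int} for the given node's accessN initiator attrs."""
    d = os.path.join(root, "node%d" % node, "access%d" % access,
                     "initiators")
    attrs = {}
    for name in ("read_bandwidth", "write_bandwidth",
                 "read_latency", "write_latency"):
        path = os.path.join(d, name)
        if os.path.exists(path):
            with open(path) as f:
                attrs[name] = int(f.read().strip())
    return attrs
```

This is the "userspace allocations or placement decisions" consumer
model: an app reads the attributes and steers its own bindings, which is
exactly the level of detail a GI driver in the kernel never sees.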
See the "migration in lieu of discard" [1] series as an attempt to make
HMAT-like info more useful. The node-demotion infrastructure puts HMAT
data to more use in the sense that the next phase of memory-tiering
enabling is to have the kernel automatically manage it, rather than
have a few enlightened apps consume the HMEM sysfs directly.

[1]: http://lore.kernel.org/r/20210126003411.2AC51464@xxxxxxxxxxxxxxxxxx
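The core idea of that series, reduced to a toy sketch (this is
illustrative Python, not the kernel implementation, and the function
name is mine): each fast-tier node gets a slower-tier demotion target
chosen by NUMA distance, so reclaim can migrate cold pages down a tier
instead of discarding them.

```python
# Toy model of demotion-target selection: map each fast (e.g. DRAM)
# node to its nearest slow (e.g. pmem) node by NUMA distance, the
# relationship reclaim then uses to demote rather than discard.

def pick_demotion_targets(fast_nodes, slow_nodes, distance):
    """Map each fast node to its nearest slow node per distance[a][b]."""
    targets = {}
    for a in fast_nodes:
        targets[a] = min(slow_nodes, key=lambda b: distance[a][b])
    return targets
```

With two sockets where node 0/1 are DRAM and node 2/3 are the locally
attached pmem, the local pmem node wins as each DRAM node's target -
the kernel derives the equivalent ordering from SLIT/HMAT data rather
than from an app reading sysfs.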