Aneesh Kumar K V <aneesh.kumar@xxxxxxxxxxxxx> writes:

> On 9/2/22 10:39 AM, Wei Xu wrote:
>> On Thu, Sep 1, 2022 at 5:33 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>>>
>>> Aneesh Kumar K V <aneesh.kumar@xxxxxxxxxxxxx> writes:
>>>
>>>> On 9/1/22 12:31 PM, Huang, Ying wrote:
>>>>> "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxx> writes:
>>>>>
>>>>>> This patch adds /sys/devices/virtual/memory_tiering/ where all memory tier
>>>>>> related details can be found. All allocated memory tiers will be listed
>>>>>> there as /sys/devices/virtual/memory_tiering/memory_tierN/
>>>>>>
>>>>>> The nodes which are part of a specific memory tier can be listed via
>>>>>> /sys/devices/virtual/memory_tiering/memory_tierN/nodes
>>>>>
>>>>> I think "memory_tier" is a better subsystem/bus name than
>>>>> memory_tiering.  Because we have a set of memory_tierN devices inside.
>>>>> "memory_tier" sounds more natural.  I know this is subjective, just my
>>>>> preference.
>>>>>
>
>
> I missed replying to this earlier. I will keep memory_tiering as subsystem name in v4
> because we would want it to be a subsystem where all memory tiering related details can be found
> including memory type in the future. This is as per discussion
>
> https://lore.kernel.org/linux-mm/CAAPL-u9TKbHGztAF=r-io3gkX7gorUunS2UfstudCWuihrA=0g@xxxxxxxxxxxxxx

I don't think that it's a good idea to mix 2 types of devices in one
subsystem (bus).  If my understanding is correct, that breaks the
driver core convention.

>>>>>>
>>>>>> A directory hierarchy looks like
>>>>>> :/sys/devices/virtual/memory_tiering$ tree memory_tier4/
>>>>>> memory_tier4/
>>>>>> ├── nodes
>>>>>> ├── subsystem -> ../../../../bus/memory_tiering
>>>>>> └── uevent
>>>>>>
>>>>>> All toptier nodes are listed via
>>>>>> /sys/devices/virtual/memory_tiering/toptier_nodes
>>>>>>
>>>>>> :/sys/devices/virtual/memory_tiering$ cat toptier_nodes
>>>>>> 0,2
>>>>>> :/sys/devices/virtual/memory_tiering$ cat memory_tier4/nodes
>>>>>> 0,2
>>>>>
>>>>> I don't think that it is a good idea to show toptier information in user
>>>>> space interface.  Because it is just an in-kernel implementation
>>>>> detail.  Now, we only promote pages from !toptier to toptier.  But
>>>>> there may be multiple memory tiers in toptier and !toptier, we may
>>>>> change the implementation in the future.  For example, we may promote
>>>>> pages from DRAM to HBM in the future.
>>>>>
>>>>
>>>>
>>>> In the case you describe above and others, we will always have a list of
>>>> NUMA nodes from which memory promotion is not done.
>>>> /sys/devices/virtual/memory_tiering/toptier_nodes shows that list.
>>>
>>> I don't think we will need that interface if we don't restrict promotion
>>> in the future.  For example, one can just check the memory tier with
>>> the smallest number.
>>>
>>> TBH, I don't know why we need that interface.  What is it for?  We
>>> don't want to expose unnecessary information to restrict our in-kernel
>>> implementation in the future.
>>>
>>> So, please remove that interface at least before we discuss it
>>> thoroughly.
>>
>> I have asked for this interface to allow the userspace to query a list
>> of top-tier nodes as the targets of userspace-driven promotions.  The
>> idea is that demotion can gradually go down tier by tier, but we
>> promote hot pages directly to the top-tier and bypass the intermediate
>> tiers.
>>
>> Certainly, this can be viewed as a policy choice.
>> Given that now we
>> have a clearly defined memory tier hierarchy in sysfs and the
>> toptier_nodes content can be constructed from this memory tier
>> hierarchy and other information from the node sysfs interfaces, I am
>> fine if we want to remove toptier_nodes and keep the current memory
>> tier sysfs interfaces to a minimum.
>>
>
>
> Ok I can do a v4 with toptier_nodes dropped.

Thanks!

Best Regards,
Huang, Ying
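
P.S. To illustrate the point that user space can reconstruct the
toptier_nodes content from the per-tier files, here is a rough sketch.
It assumes the layout from the patch description
(/sys/devices/virtual/memory_tiering/memory_tierN/nodes) and, as
suggested above, that the tier with the smallest number is the top
tier.  The function names are made up for illustration, and the node
list parsing assumes the usual "0,2" / "0-3" format; neither comes from
the patch itself.

```python
import os
import re

# Assumed location, taken from the patch description in this thread.
SYSFS_ROOT = "/sys/devices/virtual/memory_tiering"


def parse_nodelist(text):
    """Parse a node list such as "0,2" or "0-3,8" into a sorted list of ints."""
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or stray comma
        if "-" in part:
            lo, hi = part.split("-", 1)
            nodes.update(range(int(lo), int(hi) + 1))
        else:
            nodes.add(int(part))
    return sorted(nodes)


def read_tiers(root=SYSFS_ROOT):
    """Return {tier_id: [node, ...]} for each memory_tierN directory under root."""
    tiers = {}
    for name in os.listdir(root):
        match = re.fullmatch(r"memory_tier(\d+)", name)
        if match is None:
            continue  # skip anything that is not a memory_tierN device
        with open(os.path.join(root, name, "nodes")) as f:
            tiers[int(match.group(1))] = parse_nodelist(f.read())
    return tiers


def toptier_nodes(tiers):
    """Nodes of the lowest-numbered tier (assumption: smallest number is top tier)."""
    if not tiers:
        return []
    return tiers[min(tiers)]
```

Calling toptier_nodes(read_tiers()) on a system with this interface
would then give the node list without a dedicated toptier_nodes file.
If the kernel later changes how promotion targets are chosen, only this
user space policy needs to change, not the sysfs ABI.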