Re: RFC: Memory Tiering Kernel Interfaces

Wei Xu <weixugc@xxxxxxxxxx> writes:

> On Mon, May 9, 2022 at 7:32 AM Hesham Almatary
> <hesham.almatary@xxxxxxxxxx> wrote:
>>

....

>>> nearest lower tier before demoting to lower lower tiers.
>> There might still be simple cases/topologies where we might want to "skip"
>> the very next lower tier. For example, assume we have a 3 tiered memory
>> system as follows:
>>
>> Node 0 has a CPU and DDR memory in tier 0, node 1 has a GPU and DDR
>> memory in tier 0, node 2 has NVMM memory in tier 1, and node 3 has some
>> sort of bigger memory (could be a bigger DDR or something) in tier 2.
>> The distances are as follows:
>>
>> --------------          --------------
>> |   Node 0   |          |   Node 1   |
>> |  -------   |          |  -------   |
>> | |  DDR  |  |          | |  DDR  |  |
>> |  -------   |          |  -------   |
>> |            |          |            |
>> --------------          --------------
>>         | 20               | 120    |
>>         v                  v        |
>> ----------------------------       |
>> | Node 2     PMEM          |       | 100
>> ----------------------------       |
>>         | 100                       |
>>         v                           v
>> --------------------------------------
>> | Node 3    Large mem                |
>> --------------------------------------
>>
>> node distances:
>> node    0    1    2    3
>>    0   10   20   20  120
>>    1   20   10  120  100
>>    2   20  120   10  100
>>    3  120  100  100   10
>>
>> /sys/devices/system/node/memory_tiers
>> 0-1
>> 2
>> 3
>>
>> N_TOPTIER_MEMORY: 0-1
>>
>>
>> In this case, we want to be able to "skip" the demotion path from Node 1
>> to Node 2 and make demotion go directly to Node 3, as it is closer
>> distance-wise. How can we accommodate this scenario (or at least not
>> rule it out as future work) with the current RFC?
>
> This is an interesting example.  I think one way to support this is to
> allow all the lower tier nodes to be the demotion targets of a node in
> the higher tier.  We can then use the allocation fallback order to
> select the best demotion target.
>
> For this example, we will have the demotion targets of each node as:
>
> node 0: allowed=2-3, order (based on allocation fallback order): 2, 3
> node 1: allowed=2-3, order (based on allocation fallback order): 3, 2
> node 2: allowed = 3, order (based on allocation fallback order): 3
> node 3: allowed = empty
>
> What do you think?
>

Can we simplify this further with the following tiers?

tier 0 -> empty (no HBM/GPU)
tier 1 -> Node0, Node1
tier 2 -> Node2, Node3

Hence

 node 0: allowed=2-3, order (based on allocation fallback order): 2, 3
 node 1: allowed=2-3, order (based on allocation fallback order): 3, 2
 node 2: allowed = empty
 node 3: allowed = empty
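
To sanity-check both layouts, here is a rough Python sketch (not kernel
code). It assumes the allocation fallback order can be approximated by
sorting the candidate nodes on node distance alone, which ignores whatever
else the real fallback logic takes into account. The names DISTANCE,
demotion_targets, three_tiers and two_tiers are made up for illustration.

```python
# Node distances from the example (row = source node, column = target node).
DISTANCE = [
    [10, 20, 20, 120],
    [20, 10, 120, 100],
    [20, 120, 10, 100],
    [120, 100, 100, 10],
]


def demotion_targets(tiers, distance):
    """Map each node to its allowed demotion targets, nearest first.

    `tiers` is a list of node lists, ordered from the highest tier to the
    lowest. Every node in any lower tier is an allowed target; the order is
    by distance only (an assumption standing in for the fallback order).
    """
    targets = {}
    for rank, nodes in enumerate(tiers):
        # All nodes that sit in a strictly lower tier than this one.
        lower = [n for tier in tiers[rank + 1:] for n in tier]
        for node in nodes:
            # Ties on distance are broken by node id to keep this stable.
            targets[node] = sorted(lower, key=lambda t: (distance[node][t], t))
    return targets


# The three-tier layout from the quoted example.
three_tiers = [[0, 1], [2], [3]]
# The simplified layout: empty top tier, DDR nodes, then nodes 2 and 3 together.
two_tiers = [[], [0, 1], [2, 3]]

if __name__ == "__main__":
    for name, tiers in (("three tiers", three_tiers), ("two tiers", two_tiers)):
        print(name)
        for node, order in sorted(demotion_targets(tiers, DISTANCE).items()):
            print(f"  node {node}: order={order}")
```

Under that assumption, both layouts give node 0 the order 2, 3 and node 1
the order 3, 2. The only difference is that node 2 can still demote to
node 3 in the three-tier layout, while in the two-tier layout it has no
demotion target.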

-aneesh



