How to steer allocations to or away from subsets of physical memory?

I am looking for a way to steer allocations (these may be
by either userspace or the kernel) to or away from particular
ranges of memory. The reason for this is that some parts of
memory are different from others (e.g. some memory may be
faster or slower, some may be powered off when
not in use).

One approach I have considered is to use NUMA and have
each block of memory with differing attributes be its own
node. This doesn't quite fit because:

1. Unlike the standard NUMA model, access speed to a given
block of memory will not differ from one CPU to another.
Rather, there is an absolute difference in access speed
(or other attribute) that is the same from any CPU.
Thus the notion of a "local node" of memory bound to
each processor doesn't seem to fit.

These allocations must be steered independently of which
processor happens to be running.

2. For our use case it is not reasonable to change
userspace code so that it becomes node-aware (i.e. have each
process use cpusets/cgroups/memory policies directly).
Even if this were possible, the user processes will need to run
on different platforms which will have different node
layouts (i.e. there could be a varying number of nodes
of different sizes and attributes on different HW configurations,
which userspace AFAIK wouldn't be able to deal with itself).
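
To illustrate the kind of layout discovery a node-aware process
would have to do on every platform, here is a minimal sketch in C.
It assumes a NUMA-enabled kernel that exposes the online node list
in /sys/devices/system/node/online as a range list such as "0-3,5".
The function names are hypothetical and this only counts nodes; it
says nothing about their sizes or attributes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count the IDs in a kernel-style range list such as "0-3,5\n".
 * Returns -1 on malformed input. (Hypothetical helper.) */
int count_ids_in_range_list(const char *s)
{
    int count = 0;

    while (*s && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10);
        long hi;

        if (end == s || lo < 0)
            return -1;              /* no digits, or negative ID */
        hi = lo;
        if (*end == '-') {          /* a range "lo-hi" */
            const char *p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p || hi < lo)
                return -1;
        }
        count += (int)(hi - lo + 1);
        if (*end == ',')
            end++;                  /* more entries follow */
        else if (*end && *end != '\n')
            return -1;              /* unexpected character */
        s = end;
    }
    return count;
}

/* Number of online NUMA nodes, or -1 if the sysfs file is missing
 * (e.g. the kernel was built without NUMA support). */
int count_online_nodes(void)
{
    char buf[256];
    char *ok;
    FILE *f = fopen("/sys/devices/system/node/online", "r");

    if (!f)
        return -1;
    ok = fgets(buf, sizeof buf, f);
    fclose(f);
    return ok ? count_ids_in_range_list(buf) : -1;
}
```

Even with the node count in hand, each process would still need some
platform-specific way to learn which node has which attribute, which
is the part that does not seem practical for us.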

So my questions are:

1. Is NUMA the best fit here, or is there something that fits
better that I should consider?

2. If NUMA is a reasonable approach, is there already a way
to deal with nodes in a "processor independent" way (see issue
#1 above) to make the model fit our use case better?

3. We have done a "proof-of-concept" port of NUMA to ARM (at this
point artificially associating processors to nodes) and have
noticed some degradation in memory allocation time from userspace
(malloc'ing and touching various amounts of memory). This
appears to get somewhat worse as the number of nodes increases
(up to 8 which is the most we've tried), but even the case
where we enable NUMA but only have a single node is worse.
Is this to be expected, or is it simply a problem with our
initial port that should be fixable?
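
For reference, the kind of measurement described above can be
sketched as follows. This is not our actual test program, only a
minimal illustration of "malloc and touch" timing; the function name
is hypothetical and it assumes a POSIX system with clock_gettime().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Allocate `bytes` with malloc, write one byte per page so that each
 * page is touched, and return the elapsed time in nanoseconds.
 * Returns -1 on failure. (Hypothetical helper.) */
int64_t time_alloc_and_touch(size_t bytes)
{
    struct timespec t0, t1;
    volatile unsigned char *buf;
    long page = sysconf(_SC_PAGESIZE);
    size_t off;

    if (page <= 0)
        return -1;
    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1;

    buf = malloc(bytes);
    if (!buf)
        return -1;

    /* volatile keeps the compiler from dropping the stores */
    for (off = 0; off < bytes; off += (size_t)page)
        buf[off] = 1;

    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0) {
        free((void *)buf);
        return -1;
    }
    free((void *)buf);

    return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
           + (int64_t)(t1.tv_nsec - t0.tv_nsec);
}
```

Calling this for a range of sizes (say 1 MB up to a few hundred MB)
on kernels built with and without NUMA would give comparable numbers.
Small sizes may be served from heap pages that are already mapped, so
the larger sizes are the more meaningful ones for page allocation cost.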

Thanks.

Larry Bassel

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
