Hi Elena,

Thank you so much for looking into this issue. It is good to hear that you
are getting the same strange result. I posted the output of
/proc/pid/numa_maps here: https://gist.github.com/4tXJ7f/5e89f466e29cd1f7f1aa

I hope this helps.

Thanks again,
Andres

> On 29 Oct 2014, at 21:33, Elena Ufimtseva <ufimtseva@xxxxxxxxx> wrote:
>
> Hello Andres
>
> I looked at the example you gave, ran multiple variations of it, and got
> the same strange results.
> The default local policy should be in use when there is no other policy
> defined.
> The only thing that comes to my mind is the shared library libnuma, which
> has its data on a different node than the one I try to run the test
> process on.
> Can you take a look and check which node is used by libnuma in
> /proc/pid/numa_maps?
>
> I will keep searching for an answer; it is a rather interesting topic.
> Or maybe someone else will give more details on this.
>
> Thank you!
>
>
> On Thu, Oct 23, 2014 at 1:17 PM, Andres Nötzli <noetzli@xxxxxxxxxxxx> wrote:
>> Hi Elena,
>>
>> That would be great! I created a gist with the kernel config
>> (cat /boot/config-$(uname -r)):
>> https://gist.github.com/4tXJ7f/408a562abe5d4f28656d
>>
>> Please let me know if you need anything else.
>>
>> Thank you very much,
>> Andres
>>
>>> On 23 Oct 2014, at 06:15, Elena Ufimtseva <ufimtseva@xxxxxxxxx> wrote:
>>>
>>> Hi Andres
>>>
>>> I will poke around this on the weekend on my NUMA machine.
>>> Can you also attach your kernel config, please?
>>>
>>> Thank you.
>>>
>>> On Wed, Oct 22, 2014 at 12:40 PM, Andres Nötzli <noetzli@xxxxxxxxxxxx> wrote:
>>>> Hi Elena,
>>>>
>>>> Thank you very much for your quick reply! numa_set_strict(1) and
>>>> numa_set_strict(0) both result in the wrong output. I did not change
>>>> the default policy.
>>>>
>>>> numa_get_membind returns 1 for all nodes before and after numa_run_on_node.
>>>> numa_get_interleave_mask returns 0 for all nodes.
>>>> numa_get_run_node_mask is all 1s before and 0010 after numa_run_on_node.
>>>>
>>>> The machine config (the CPUs are all Intel(R) Xeon(R) CPU E5-4657L v2 @ 2.40GHz):
>>>>
>>>> $ numactl --hardware
>>>> available: 4 nodes (0-3)
>>>> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 48 49 50 51 52 53 54 55 56 57 58 59
>>>> node 0 size: 262093 MB
>>>> node 0 free: 966 MB
>>>> node 1 cpus: 12 13 14 15 16 17 18 19 20 21 22 23 60 61 62 63 64 65 66 67 68 69 70 71
>>>> node 1 size: 262144 MB
>>>> node 1 free: 82 MB
>>>> node 2 cpus: 24 25 26 27 28 29 30 31 32 33 34 35 72 73 74 75 76 77 78 79 80 81 82 83
>>>> node 2 size: 262144 MB
>>>> node 2 free: 102 MB
>>>> node 3 cpus: 36 37 38 39 40 41 42 43 44 45 46 47 84 85 86 87 88 89 90 91 92 93 94 95
>>>> node 3 size: 262144 MB
>>>> node 3 free: 113 MB
>>>> node distances:
>>>> node   0   1   2   3
>>>>   0:  10  20  30  20
>>>>   1:  20  10  20  30
>>>>   2:  30  20  10  20
>>>>   3:  20  30  20  10
>>>>
>>>> Thanks again,
>>>> Andres
>>>>
>>>>> On 22 Oct 2014, at 06:12, Elena Ufimtseva <ufimtseva@xxxxxxxxx> wrote:
>>>>>
>>>>> On Tue, Oct 21, 2014 at 11:47 PM, Andres Nötzli <noetzli@xxxxxxxxxxxx> wrote:
>>>>>> Hi everyone,
>>>>>>
>>>>>> I am experiencing a weird problem. When I use numa_alloc_onnode
>>>>>> repeatedly to allocate memory, it does not allocate the memory on
>>>>>> the node passed as an argument.
>>>>>>
>>>>>> Sample code:
>>>>>> #include <numa.h>
>>>>>> #include <numaif.h>
>>>>>> #include <iostream>
>>>>>> using namespace std;
>>>>>>
>>>>>> void find_memory_node_for_addr(void* ptr) {
>>>>>>   int numa_node = -1;
>>>>>>   if(get_mempolicy(&numa_node, NULL, 0, ptr, MPOL_F_NODE | MPOL_F_ADDR) < 0)
>>>>>>     cout << "WARNING: get_mempolicy failed" << endl;
>>>>>>   cout << numa_node << endl;
>>>>>> }
>>>>>>
>>>>>> int main() {
>>>>>>   int64_t* x;
>>>>>>   int64_t n = 5000;
>>>>>>   //numa_set_preferred(1);
>>>>>>
>>>>>>   numa_run_on_node(2);
>>>>>>   for(int i = 0; i < 20; i++) {
>>>>>>     size_t s = n * sizeof(int64_t);
>>>>>>     x = (int64_t*)numa_alloc_onnode(s, 1);
>>>>>>     for(int j = 0; j < n; j++)
>>>>>>       x[j] = j + i;
>>>>>>     find_memory_node_for_addr(x);
>>>>>>   }
>>>>>>
>>>>>>   return 0;
>>>>>> }
>>>>>>
>>>>>> Output:
>>>>>> 1
>>>>>> 1
>>>>>> 1
>>>>>> 2
>>>>>> 1
>>>>>> 2
>>>>>> 1
>>>>>> 2
>>>>>> 1
>>>>>> 2
>>>>>> 1
>>>>>> 2
>>>>>> 1
>>>>>> 2
>>>>>> 1
>>>>>> 2
>>>>>> 1
>>>>>> 2
>>>>>> 1
>>>>>> 2
>>>>>>
>>>>>> When uncommenting the line "numa_set_preferred(1);", the output is
>>>>>> all 1s as expected. Am I doing something wrong? Have you seen
>>>>>> similar issues?
>>>>>>
>>>>>> I am running Ubuntu 12.04.5 LTS:
>>>>>> $ cat /proc/version
>>>>>> Linux version 3.2.0-29-generic (buildd@allspice) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012
>>>>>>
>>>>>> I am using libnuma 2.0.10, but I have had the same problem with 2.0.8~rc3-1.
>>>>>>
>>>>>> Thank you very much,
>>>>>> Andres
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-numa" in
>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>> Hi Andres
>>>>>
>>>>> Can you try to use a strict policy by calling numa_set_strict?
>>>>>
>>>>> If you comment out setting the preferred node, the default policy is
>>>>> in effect (I assume you did not change it, either for the process or
>>>>> system-wide), which is also "preferred".
>>>>> But here you set preferred to a specific node, and the manual says
>>>>> that the default for a process is to allocate on the node it runs on.
>>>>> So I wonder what the CPU affinity for this process looks like...
>>>>> Also, maybe just to confirm, can you check the policy from within
>>>>> your running code?
>>>>>
>>>>> Can you also post the machine's NUMA config?
>>>>>
>>>>> --
>>>>> Elena
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Elena
>>
>>
>
>
>
> --
> Elena
--
To unsubscribe from this list: send the line "unsubscribe linux-numa" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
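
Elena's suggestion to confirm the policy from within the running code can be
done directly with get_mempolicy(). Below is a minimal sketch, not part of
the original thread, that prints the policy installed on a
numa_alloc_onnode() mapping, the physical node its first page actually lands
on, and the calling thread's own task policy. The file name, the policy_name
helper, the node number and the allocation size are illustrative assumptions;
build with something like "g++ -o check_policy check_policy.cpp -lnuma".

// Minimal sketch: query the policy and placement of a numa_alloc_onnode()
// allocation from inside the process.
#include <numa.h>
#include <numaif.h>
#include <cstdint>
#include <cstring>
#include <iostream>
using namespace std;

static const char* policy_name(int mode) {
    switch (mode) {
        case MPOL_DEFAULT:    return "MPOL_DEFAULT";
        case MPOL_PREFERRED:  return "MPOL_PREFERRED";
        case MPOL_BIND:       return "MPOL_BIND";
        case MPOL_INTERLEAVE: return "MPOL_INTERLEAVE";
        default:              return "other";
    }
}

int main() {
    if (numa_available() < 0) {
        cerr << "NUMA is not available on this system" << endl;
        return 1;
    }

    size_t s = 5000 * sizeof(int64_t);
    void* x = numa_alloc_onnode(s, 1);   // ask for node 1, as in the test program
    if (x == NULL) {
        cerr << "numa_alloc_onnode failed" << endl;
        return 1;
    }
    memset(x, 0, s);                     // touch the pages so they really get allocated

    int mode = -1;
    // Policy libnuma installed on the mapping that contains x; exactly which
    // mode shows up here depends on numa_set_strict()/numa_set_bind_policy().
    if (get_mempolicy(&mode, NULL, 0, x, MPOL_F_ADDR) == 0)
        cout << "VMA policy : " << policy_name(mode) << endl;

    int node = -1;
    // Physical node the first page of x actually resides on.
    if (get_mempolicy(&node, NULL, 0, x, MPOL_F_NODE | MPOL_F_ADDR) == 0)
        cout << "page node  : " << node << endl;

    // The calling thread's own task policy, for comparison.
    if (get_mempolicy(&mode, NULL, 0, NULL, 0) == 0)
        cout << "task policy: " << policy_name(mode) << endl;

    numa_free(x, s);
    return 0;
}

If the mapping's policy looks right but the reported page node does not
match, the policy is being installed correctly and the question shifts to
why the kernel places the pages elsewhere, which is what the thread is
observing.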
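
For Elena's question about which node holds libnuma's own data pages, the
relevant lines of /proc/<pid>/numa_maps can also be dumped from inside the
process. Another minimal sketch, again not from the thread, assuming that
matching lines containing the substring "libnuma" is enough to find the
library's mappings:

// Minimal sketch: print the /proc/self/numa_maps lines that mention libnuma,
// to see which node holds the library's data pages.
#include <numa.h>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main() {
    // Force libnuma to be loaded and initialised before we look at the maps.
    if (numa_available() < 0) {
        cerr << "NUMA is not available on this system" << endl;
        return 1;
    }

    ifstream maps("/proc/self/numa_maps");
    string line;
    while (getline(maps, line)) {
        // Each line looks roughly like:
        //   <start> <policy> ... N<node>=<pages> ... file=/usr/lib/.../libnuma.so.1
        if (line.find("libnuma") != string::npos)
            cout << line << endl;
    }
    return 0;
}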