Re: Understanding the allocation size of mlx5_alloc_buf

On Thu, Jul 06, 2023 at 12:48:46PM +0000, Olaf.Krzikalla@xxxxxx wrote:
> Hi @all,
> 
> Creating connections via create_qp fails on our cluster with an out-of-memory error at rather small process counts (128 works, 256 does not). I've tracked the issue down to an mlx5_alloc_buf call that allocates ~500kB per call, which seems like a lot.
> 
> heaptrack tells me the following:
> 
> 34.47M peak memory consumed over 92 calls from
> mlx5_alloc_buf
>   in /usr/lib64/libibverbs/libmlx5-rdmav34.so
> 8.65M consumed over 16 calls from:
>     create_qp
>       in /usr/lib64/libibverbs/libmlx5-rdmav34.so
>     mlx5_create_qp
>       in /usr/lib64/libibverbs/libmlx5-rdmav34.so
> .
> 
> Can anyone help me understand what causes a 500kB allocation in create_qp? Maybe it is a configuration issue that I can handle somehow.
> 
> Thanks for your help and best regards
> Olaf Krzikalla
> 
> 
> System information:
> CentOS Linux 7 (Core)
> Linux 3.10.0-1160.88.1.el7.x86_64

Please contact your Nvidia support representative; you are talking about a distro kernel,
not upstream Linux.

Thanks

> CA 'mlx5_0'
>         CA type: MT4123
>         Number of ports: 1
>         Firmware version: 20.33.1048
>         Hardware version: 0
> 
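
As a quick sanity check of the "~500kB per call" figure, the heaptrack totals quoted in the message can be divided by their call counts. This is only a sketch: it assumes heaptrack's "M" means 10**6 bytes, and the helper name avg_kb_per_call is made up for illustration. The result is an average, so individual allocations may be larger or smaller.

```python
# Figures copied from the heaptrack output quoted in the original message.
# Assumption: "M" is 10**6 bytes; if it is MiB the values shift by about 5%.

def avg_kb_per_call(total_m: float, calls: int) -> float:
    """Average allocation size in kB for a 'consumed over N calls' line."""
    return total_m * 1000.0 / calls

# 8.65M over 16 calls reached through create_qp / mlx5_create_qp
create_qp_avg = avg_kb_per_call(8.65, 16)

# 34.47M peak over all 92 mlx5_alloc_buf calls
overall_avg = avg_kb_per_call(34.47, 92)

print(f"create_qp path: {create_qp_avg:.1f} kB per call")
print(f"all callers:    {overall_avg:.1f} kB per call")
```

The create_qp path averages roughly 540 kB per call, consistent with the reported figure. In libibverbs the queue depths are requested through the cap fields of struct ibv_qp_init_attr (max_send_wr, max_recv_wr, max_send_sge, max_recv_sge, max_inline_data), so those values are a plausible place to look when checking whether the per-QP buffer can be reduced; that is a suggestion, not something confirmed in this thread.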