Re: Is there a working cache for path record and lids etc for librdmacm?

Hi Christopher,

On 11/17/20 11:57 AM, Christopher Lameter wrote:
> We have a large number of apps running on the same host that are all
> sending to the same set of hosts. Lots of requests for address resolution
> are going to the SM and for a large set of hosts this can become too much
> for the SM.

I used ibacm successfully years ago (somewhere in the 2013-2015
timeframe) but abandoned the approach because some measurements
indicated that OpenMPI with rdmacm had a large runtime overhead
compared to OpenMPI+oob. (Mellanox was informed, but I'm unsure how
much has changed since then.)

> Is there something that can locally cache the results of the SM queries to
> avoid additional requests?

Not that I know of, but others might know better. Maybe try contacting
Sean Hefty (the driving force behind ibacm) directly, in case he missed
your email here on the list.

> We have tried IBACM but the address resolution does not work on it. It is
> unable to complete a request for any address resolution and leaves kernel
> threads that never terminate instead.

Setting up ibacm was/is painful. Maybe you could first verify that it
works on a test bed, using the low-level rdmacm tools (ping-pong, etc.)
to debug.

Furthermore, another thing I learned the hard way is that a cold cache
can overwhelm opensm as well. So, if you deploy ibacm, you have to make
sure that not too many requests go to the local ibacm on too many nodes
simultaneously right after the ibacm service starts. Otherwise, all
nodes send numerous requests to opensm at once and those requests could
time out, which could be the reason for your stalled kernel threads.
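
One way to avoid such a start-up storm is to stagger the first requests
across nodes. The following is only a rough sketch of that idea, not
anything ibacm provides: the function name, the node index, and the
window and jitter values are all hypothetical and would have to come
from your own deployment tooling.

```python
import random


def startup_delay(node_index, total_nodes, window_s=60.0, jitter_s=2.0, rng=None):
    """Return how many seconds a node should wait before its first requests.

    Hypothetical helper: nodes are spread evenly over `window_s` seconds
    by their index, plus a small random jitter so that nodes sharing a
    slot do not fire at exactly the same moment.
    """
    rng = rng or random.Random()
    slot = window_s * node_index / max(total_nodes, 1)
    return slot + rng.uniform(0.0, jitter_s)


if __name__ == "__main__":
    # Example: print the delay for a few nodes of an assumed 100-node cluster.
    for idx in (0, 25, 50, 99):
        print(idx, round(startup_delay(idx, 100), 2))
```

Each node would sleep for its computed delay before the applications
issue their first address resolutions, so opensm sees the cold-cache
queries spread over the window instead of all at once.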

(Another explanation is obviously a bug in ibacm and/or an
incompatibility with newer versions of librdmacm, opensm, or other IB
libs.)

Sorry that I cannot provide more specific and direct help, but maybe my
pointers can help you solve the issue.

Best,
 Jens


