On Fri, 27 Nov 2020, Håkon Bugge wrote:

> > Huh? When does it talk to a subnet manager (or the SA)?
>
> When resolving the route AND the option "route_prot" is set to "sa". If
> set to "acm", what Hong describes above applies.

My config has "route_prot" set to "sa"

> > If its get an IP address of an IB node that does not have ibacm then it
> > fails with a timeout ..... ? And leaves hanging kernel threads around by
> > design?
>
> Nop, the kernel falls back and uses the neighbour cache instead.

But ib_acme hangs? The main issue here is what the user space app does.
And we need ibacm to cache user space address resolutions.

> > So it only populates the cache from its local node information?
>
> No, if you use ibacm for address resolution the only protocol it has is
> "acm", which means the information comes from a peer ibacm.
>
> If you talk about the cache for routes, it comes either from the SA or a
> peer ibacm, depending on the "route_prot" setting.

I have always run it with that setting. How can I debug this issue and how
can we fix this?

> >> To resolve IPoIB address to PathRecord, you must:
> >> 1) The IPoIB interface must UP and RUNNING on the client and target
> >> side.
> >> 2) The ibacm service must RUNNING on the client and target.
> >
> > That is working if you want to resolve only the IP addresses of the IB
> > interfaces on the client and target. None else.
>
> That is why it is called IBacm, right?

Huh? IBACM is an address resolution service for IB. Somehow that only
includes addresses of hosts running IBACM?

> > Here is the description of ibacms function from the sources:
> >
> > "Conceptually, the ibacm service implements an ARP like protocol and
> > either uses IB multicast records to construct path record data or queries
> > the SA directly, depending on the selected route protocol. By default, the
> > ibacm services uses and caches SA path record queries."
> >
> > SA queries dont work. So its broken and cannot talk to the SM.
>
> Why do you say that? It works all the time for me which uses "sa" as
> "route_prot".

Not here and not in the tests that RH ran to verify the issue.
"route_prot" set to "sa" is the default config for the Redhat release of
IBACM. However, the addr_prot is set to "acm" by default. I set it to "sa"
with no effect.