On 12/19/2018 3:08 PM, Håkon Bugge wrote: > Using ibacm in an environment with limited pkeys, it is unable to > obtain a Path Record (PR) in some situations. > > The use of limited pkey implies communication over the logical > partition where one peer is a full member and the other is limited. > > When comparing pkey values, the above difference will yield a > mismatch, even if the logical partitions of the two pkeys compared are > the very same. > > Fixed by introducing a helper function which compares the logical > partitions. > > Querying a PR on the peer which uses the full partition, we see > without this commit: > > ib_acme -s 192.168.200.203 -d 192.168.200.200 -v > Service: /usr/local/var/run/ibacm-unix.sock > Destination: 192.168.200.200 > Source: 192.168.200.203 > Path information > dgid: fe80::21:2800:1a1:7c97 > sgid: fe80::21:2800:1ce:ac18 > dlid: 243 > slid: 246 > flow label: 0x0 > hop limit: 0 > tclass: 0 > reversible: 1 > pkey: 0x2b4 > sl: 0 > mtu: 4 > rate: 7 > packet lifetime: 16 > SA verification: failed Cannot assign requested address > > The verbose ibacm log contains: > > acm_get_ep_address: SLID(246) DLID(243) > acm_get_ep_address: notice - could not find SLID(246) DLID(243) > acm_svr_query_path: notice - could not find local end point address > > With this commit, the query with verification has success and we see > in the log: > > acm_get_ep_address: SLID(246) DLID(243) > acm_same_partition: pkey_a: 0x02b4 pkey_b: 0x82b4 > [] > acm_query_response: status 0x0 > > Another example is a kernel client that requests PRs over > netlink. Without this commit, the statistics counters show: > > ib_acme -P col > /usr/local/var/run/ibacm-unix.sock > Error Count : 234 > Resolve Count : 234 > [] > > With this commit, we see: > > ib_acme -P col > /usr/local/var/run/ibacm-unix.sock > Error Count : 0 > Resolve Count : 228 > No Data : 0 > Addr Query Count : 0 > Addr Cache Count : 0 > Route Query Count : 1 > Route Cache Count : 226 > > Signed-off-by: Håkon Bugge <haakon.bugge@xxxxxxxxxx> > --- > ibacm/src/acm.c | 13 ++++++++++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/ibacm/src/acm.c b/ibacm/src/acm.c > index b695d1bc..14936597 100644 > --- a/ibacm/src/acm.c > +++ b/ibacm/src/acm.c > @@ -820,6 +820,13 @@ acm_is_path_from_port(struct acmc_port *port, struct ibv_path_record *path) > return 0; > } > > +static bool acm_same_partition(uint16_t pkey_a, uint16_t pkey_b) { > + > + acm_log(2, "pkey_a: 0x%04x pkey_b: 0x%04x\n", pkey_a, pkey_b); > + > + return ((pkey_a | IB_PKEY_FULL_MEMBER) == (pkey_b | IB_PKEY_FULL_MEMBER)); Doesn't this allow limited member to talk to another limited member of the same partition ? At least one needs to be a full member. -- Hal > +} > + > static struct acmc_addr * > acm_get_port_ep_address(struct acmc_port *port, struct acm_ep_addr_data *data) > { > @@ -837,7 +844,7 @@ acm_get_port_ep_address(struct acmc_port *port, struct acm_ep_addr_data *data) > list_for_each(&port->ep_list, ep, entry) { > if ((data->type == ACM_EP_INFO_PATH) && > (!data->info.path.pkey || > - (be16toh(data->info.path.pkey) == ep->endpoint.pkey))) { > + acm_same_partition(be16toh(data->info.path.pkey), ep->endpoint.pkey))) { > for (i = 0; i < MAX_EP_ADDR; i++) { > if (ep->addr_info[i].addr.type) > return &ep->addr_info[i]; > @@ -2158,7 +2165,7 @@ static int acm_assign_ep_names(struct acmc_ep *ep) > > if (!strcasecmp(dev_name, dev) && > (ep->port->port.port_num == (uint8_t) port) && > - (ep->endpoint.pkey == pkey)) { > + acm_same_partition(ep->endpoint.pkey, pkey)) { > acm_log(1, "assigning %s\n", name); > if (acm_ep_insert_addr(ep, name, addr, type)) { > acm_log(1, "maximum number of names assigned to EP\n"); > @@ -2179,7 +2186,7 @@ static struct acmc_ep *acm_find_ep(struct acmc_port *port, uint16_t pkey) > acm_log(2, "pkey 0x%x\n", pkey); > > list_for_each(&port->ep_list, ep, entry) { > - if (ep->endpoint.pkey == pkey) { > + if (acm_same_partition(ep->endpoint.pkey, pkey)) { > res = ep; > break; > } >