Re: [PATCH] ibacm: Compare logical partitions instead of pkeys

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/20/2018 2:08 AM, Håkon Bugge wrote:
> [resent as plain-text]
> 
>> On 20 Dec 2018, at 04:03, Hal Rosenstock <hal@xxxxxxxxxxxxxxxxxx> wrote:
>>
>> On 12/19/2018 3:08 PM, Håkon Bugge wrote:
>>> Using ibacm in an environment with limited pkeys, it is unable to
>>> obtain a Path Record (PR) in some situations.
>>>
>>> The use of limited pkey implies communication over the logical
>>> partition where one peer is a full member and the other is limited.
>>>
>>> When comparing pkey values, the above difference will yield a
>>> mismatch, even if the logical partitions of the two pkeys compared are
>>> the very same.
>>>
>>> Fixed by introducing a helper function which compares the logical
>>> partitions.
>>>
>>> Querying a PR on the peer which uses the full partition, we see
>>> without this commit:
>>>
>>> ib_acme -s 192.168.200.203 -d  192.168.200.200 -v
>>> Service: /usr/local/var/run/ibacm-unix.sock
>>> Destination: 192.168.200.200
>>> Source: 192.168.200.203
>>> Path information
>>>  dgid: fe80::21:2800:1a1:7c97
>>>  sgid: fe80::21:2800:1ce:ac18
>>>  dlid: 243
>>>  slid: 246
>>>  flow label: 0x0
>>>  hop limit: 0
>>>  tclass: 0
>>>  reversible: 1
>>>  pkey: 0x2b4
>>>  sl: 0
>>>  mtu: 4
>>>  rate: 7
>>>  packet lifetime: 16
>>> SA verification: failed Cannot assign requested address
>>>
>>> The verbose ibacm log contains:
>>>
>>> acm_get_ep_address: SLID(246) DLID(243)
>>> acm_get_ep_address: notice - could not find SLID(246) DLID(243)
>>> acm_svr_query_path: notice - could not find local end point address
>>>
>>> With this commit, the query with verification has success and we see
>>> in the log:
>>>
>>> acm_get_ep_address: SLID(246) DLID(243)
>>> acm_same_partition: pkey_a: 0x02b4 pkey_b: 0x82b4
>>> []
>>> acm_query_response: status 0x0
>>>
>>> Another example is a kernel client that requests PRs over
>>> netlink. Without this commit, the statistics counters show:
>>>
>>> ib_acme -P col
>>> /usr/local/var/run/ibacm-unix.sock
>>> Error Count : 234
>>> Resolve Count : 234
>>> []
>>>
>>> With this commit, we see:
>>>
>>> ib_acme -P col
>>> /usr/local/var/run/ibacm-unix.sock
>>> Error Count : 0
>>> Resolve Count : 228
>>> No Data : 0
>>> Addr Query Count : 0
>>> Addr Cache Count : 0
>>> Route Query Count : 1
>>> Route Cache Count : 226
>>>
>>> Signed-off-by: Håkon Bugge <haakon.bugge@xxxxxxxxxx>
>>> ---
>>> ibacm/src/acm.c | 13 ++++++++++---
>>> 1 file changed, 10 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/ibacm/src/acm.c b/ibacm/src/acm.c
>>> index b695d1bc..14936597 100644
>>> --- a/ibacm/src/acm.c
>>> +++ b/ibacm/src/acm.c
>>> @@ -820,6 +820,13 @@ acm_is_path_from_port(struct acmc_port *port, struct ibv_path_record *path)
>>> 	return 0;
>>> }
>>>
>>> +static bool acm_same_partition(uint16_t pkey_a, uint16_t pkey_b) {
>>> +
>>> +	acm_log(2, "pkey_a: 0x%04x pkey_b: 0x%04x\n", pkey_a, pkey_b);
>>> +
>>> +	return ((pkey_a | IB_PKEY_FULL_MEMBER) == (pkey_b | IB_PKEY_FULL_MEMBER));
>>
>> Doesn't this allow limited member to talk to another limited member of
>> the same partition ? At least one needs to be a full member.
> 
> The partition enforcement is done by the HCA ports (optionally also switch ports).

Yes but this mechanism is meant for errors and this is introducing it as
a normal case. SA will not return path between 2 limited members of same
partition so that case is not supposed to occur. Also, it will cause bad
P_Key traps by HCA/switch to SM for every such packet and this is a bad
side effect.

-- Hal

> The old code did "pkey_a == pkey_b", so in this sense, nothing is changed. Further, this is aka, you get in a request from the peer, which is a full member, and you need to search the end-points to see if you can communicate. Through my testing (and the debug prints of the pkeys), I have seen all four permutations of limited/full for a/b.

> 
> 
> 
> Thxs, Håkon
> 
>>
>> -- Hal
>>
>>> +}
>>> +
>>> static struct acmc_addr *
>>> acm_get_port_ep_address(struct acmc_port *port, struct acm_ep_addr_data *data)
>>> {
>>> @@ -837,7 +844,7 @@ acm_get_port_ep_address(struct acmc_port *port, struct acm_ep_addr_data *data)
>>> 	list_for_each(&port->ep_list, ep, entry) {
>>> 		if ((data->type == ACM_EP_INFO_PATH) &&
>>> 		    (!data->info.path.pkey ||
>>> -		     (be16toh(data->info.path.pkey) == ep->endpoint.pkey))) {
>>> +		     acm_same_partition(be16toh(data->info.path.pkey), ep->endpoint.pkey))) {
>>> 			for (i = 0; i < MAX_EP_ADDR; i++) {
>>> 				if (ep->addr_info[i].addr.type)
>>> 					return &ep->addr_info[i];
>>> @@ -2158,7 +2165,7 @@ static int acm_assign_ep_names(struct acmc_ep *ep)
>>>
>>> 		if (!strcasecmp(dev_name, dev) &&
>>> 		    (ep->port->port.port_num == (uint8_t) port) &&
>>> -		    (ep->endpoint.pkey == pkey)) {
>>> +		    acm_same_partition(ep->endpoint.pkey, pkey)) {
>>> 			acm_log(1, "assigning %s\n", name);
>>> 			if (acm_ep_insert_addr(ep, name, addr, type)) {
>>> 				acm_log(1, "maximum number of names assigned to EP\n");
>>> @@ -2179,7 +2186,7 @@ static struct acmc_ep *acm_find_ep(struct acmc_port *port, uint16_t pkey)
>>> 	acm_log(2, "pkey 0x%x\n", pkey);
>>>
>>> 	list_for_each(&port->ep_list, ep, entry) {
>>> -		if (ep->endpoint.pkey == pkey) {
>>> +		if (acm_same_partition(ep->endpoint.pkey, pkey)) {
>>> 			res = ep;
>>> 			break;
>>> 		}
> 



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux