Re: [PATCH] ibacm: Compare logical partitions instead of pkeys

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[resent as plain-text]

> On 20 Dec 2018, at 04:03, Hal Rosenstock <hal@xxxxxxxxxxxxxxxxxx> wrote:
> 
> On 12/19/2018 3:08 PM, Håkon Bugge wrote:
>> Using ibacm in an environment with limited pkeys, it is unable to
>> obtain a Path Record (PR) in some situations.
>> 
>> The use of limited pkey implies communication over the logical
>> partition where one peer is a full member and the other is limited.
>> 
>> When comparing pkey values, the above difference will yield a
>> mismatch, even if the logical partitions of the two pkeys compared are
>> the very same.
>> 
>> Fixed by introducing a helper function which compares the logical
>> partitions.
>> 
>> Querying a PR on the peer which uses the full partition, we see
>> without this commit:
>> 
>> ib_acme -s 192.168.200.203 -d  192.168.200.200 -v
>> Service: /usr/local/var/run/ibacm-unix.sock
>> Destination: 192.168.200.200
>> Source: 192.168.200.203
>> Path information
>>  dgid: fe80::21:2800:1a1:7c97
>>  sgid: fe80::21:2800:1ce:ac18
>>  dlid: 243
>>  slid: 246
>>  flow label: 0x0
>>  hop limit: 0
>>  tclass: 0
>>  reversible: 1
>>  pkey: 0x2b4
>>  sl: 0
>>  mtu: 4
>>  rate: 7
>>  packet lifetime: 16
>> SA verification: failed Cannot assign requested address
>> 
>> The verbose ibacm log contains:
>> 
>> acm_get_ep_address: SLID(246) DLID(243)
>> acm_get_ep_address: notice - could not find SLID(246) DLID(243)
>> acm_svr_query_path: notice - could not find local end point address
>> 
>> With this commit, the query with verification has success and we see
>> in the log:
>> 
>> acm_get_ep_address: SLID(246) DLID(243)
>> acm_same_partition: pkey_a: 0x02b4 pkey_b: 0x82b4
>> []
>> acm_query_response: status 0x0
>> 
>> Another example is a kernel client that requests PRs over
>> netlink. Without this commit, the statistics counters show:
>> 
>> ib_acme -P col
>> /usr/local/var/run/ibacm-unix.sock
>> Error Count : 234
>> Resolve Count : 234
>> []
>> 
>> With this commit, we see:
>> 
>> ib_acme -P col
>> /usr/local/var/run/ibacm-unix.sock
>> Error Count : 0
>> Resolve Count : 228
>> No Data : 0
>> Addr Query Count : 0
>> Addr Cache Count : 0
>> Route Query Count : 1
>> Route Cache Count : 226
>> 
>> Signed-off-by: Håkon Bugge <haakon.bugge@xxxxxxxxxx>
>> ---
>> ibacm/src/acm.c | 13 ++++++++++---
>> 1 file changed, 10 insertions(+), 3 deletions(-)
>> 
>> diff --git a/ibacm/src/acm.c b/ibacm/src/acm.c
>> index b695d1bc..14936597 100644
>> --- a/ibacm/src/acm.c
>> +++ b/ibacm/src/acm.c
>> @@ -820,6 +820,13 @@ acm_is_path_from_port(struct acmc_port *port, struct ibv_path_record *path)
>> 	return 0;
>> }
>> 
>> +static bool acm_same_partition(uint16_t pkey_a, uint16_t pkey_b) {
>> +
>> +	acm_log(2, "pkey_a: 0x%04x pkey_b: 0x%04x\n", pkey_a, pkey_b);
>> +
>> +	return ((pkey_a | IB_PKEY_FULL_MEMBER) == (pkey_b | IB_PKEY_FULL_MEMBER));
> 
> Doesn't this allow limited member to talk to another limited member of
> the same partition ? At least one needs to be a full member.

The partition enforcement is done by the HCA ports (optionally also switch ports).

The old code did "pkey_a == pkey_b", so in this sense, nothing is changed. Further, this is aka, you get in a request from the peer, which is a full member, and you need to search the end-points to see if you can communicate. Through my testing (and the debug prints of the pkeys), I have seen all four permutations of limited/full for a/b.



Thxs, Håkon

> 
> -- Hal
> 
>> +}
>> +
>> static struct acmc_addr *
>> acm_get_port_ep_address(struct acmc_port *port, struct acm_ep_addr_data *data)
>> {
>> @@ -837,7 +844,7 @@ acm_get_port_ep_address(struct acmc_port *port, struct acm_ep_addr_data *data)
>> 	list_for_each(&port->ep_list, ep, entry) {
>> 		if ((data->type == ACM_EP_INFO_PATH) &&
>> 		    (!data->info.path.pkey ||
>> -		     (be16toh(data->info.path.pkey) == ep->endpoint.pkey))) {
>> +		     acm_same_partition(be16toh(data->info.path.pkey), ep->endpoint.pkey))) {
>> 			for (i = 0; i < MAX_EP_ADDR; i++) {
>> 				if (ep->addr_info[i].addr.type)
>> 					return &ep->addr_info[i];
>> @@ -2158,7 +2165,7 @@ static int acm_assign_ep_names(struct acmc_ep *ep)
>> 
>> 		if (!strcasecmp(dev_name, dev) &&
>> 		    (ep->port->port.port_num == (uint8_t) port) &&
>> -		    (ep->endpoint.pkey == pkey)) {
>> +		    acm_same_partition(ep->endpoint.pkey, pkey)) {
>> 			acm_log(1, "assigning %s\n", name);
>> 			if (acm_ep_insert_addr(ep, name, addr, type)) {
>> 				acm_log(1, "maximum number of names assigned to EP\n");
>> @@ -2179,7 +2186,7 @@ static struct acmc_ep *acm_find_ep(struct acmc_port *port, uint16_t pkey)
>> 	acm_log(2, "pkey 0x%x\n", pkey);
>> 
>> 	list_for_each(&port->ep_list, ep, entry) {
>> -		if (ep->endpoint.pkey == pkey) {
>> +		if (acm_same_partition(ep->endpoint.pkey, pkey)) {
>> 			res = ep;
>> 			break;
>> 		}




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux