Re: [PATCH ibacm 1/4] ibacm: Allocate end-point addresses dynamically - v2

Håkon Bugge <haakon.bugge@xxxxxxxxxx> · Mon, 3 Dec 2018 17:14:03 +0100

Thanks for the review, Doug!

> On 3 Dec 2018, at 16:26, Doug Ledford <dledford@xxxxxxxxxx> wrote:
> 
> On Fri, 2018-11-23 at 16:40 +0100, Håkon Bugge wrote:
>> @@ -421,7 +421,7 @@ static void acm_mark_addr_invalid(struct acmc_ep *ep,
>> {
>> 	int i;
>> 
>> -	for (i = 0; i < MAX_EP_ADDR; i++) {
>> +	for (i = 0; i < ep->nmbr_ep_addrs; i++) {
>> 		if (!acm_addr_cmp(&ep->addr_info[i].addr, data->info.addr, data->type)) {
>> 			ep->addr_info[i].addr.type = ACM_ADDRESS_INVALID;
>> 			ep->port->prov->remove_address(ep->addr_info[i].prov_addr_context);
>> @@ -437,7 +437,7 @@ acm_addr_lookup(const struct acm_endpoint *endpoint, uint8_t *addr, uint8_t addr
>> 	int i;
>> 
>> 	ep = container_of(endpoint, struct acmc_ep, endpoint);
>> -	for (i = 0; i < MAX_EP_ADDR; i++)
>> +	for (i = 0; i < ep->nmbr_ep_addrs; i++)
>> 		if (!acm_addr_cmp(&ep->addr_info[i].addr, addr, addr_type))
>> 			return &ep->addr_info[i].addr;
>> 
>> @@ -838,7 +838,7 @@ acm_get_port_ep_address(struct acmc_port *port, struct acm_ep_addr_data *data)
>> 		if ((data->type == ACM_EP_INFO_PATH) &&
>> 		    (!data->info.path.pkey ||
>> 		     (be16toh(data->info.path.pkey) == ep->endpoint.pkey))) {
>> -			for (i = 0; i < MAX_EP_ADDR; i++) {
>> +			for (i = 0; i < ep->nmbr_ep_addrs; i++) {
>> 				if (ep->addr_info[i].addr.type)
>> 					return &ep->addr_info[i];
>> 			}
> 
> All of these loops over nmbr_ep_addrs worry me.  What happens if we are
> doing one of these loops while someone else is simultaneously removing
> addresses?  

The commit doesn't shrink the table, but your point is perfectly valid anyway. The addr_info pointer may change at any time. A read/write locking scheme is needed.

> I suggesting running cmtime with some large number of
> connections while simultaneously running a loop on another terminal that
> adds addresses up to 250 addresses and then removes addresses all the
> way back down to 1.

That is a good test. Will add this test.

>> static int acm_nl_is_valid_resolve_request(struct acm_nl_msg *acmnlmsg)
>> @@ -2018,12 +2073,21 @@ acm_ep_insert_addr(struct acmc_ep *ep, const char *name, uint8_t *addr,
>> 	memcpy(tmp, addr, acm_addr_len(addr_type));
>> 
>> 	if (!acm_addr_lookup(&ep->endpoint, addr, addr_type)) {
>> -		for (i = 0; (i < MAX_EP_ADDR) &&
>> +		for (i = 0; (i < ep->nmbr_ep_addrs) &&
>> 			    (ep->addr_info[i].addr.type != ACM_ADDRESS_INVALID); i++)
>> 			;
>> -		if (i == MAX_EP_ADDR) {
>> -			ret = ENOMEM;
>> -			goto out;
>> +		if (i == ep->nmbr_ep_addrs) {
>> +			++ep->nmbr_ep_addrs;
> 
> Don't increment this prior to the realloc succeeding.  You just created
> a race condition that can cause somewhere else in the code to access off
> the end of the array if they read this incremented counter while you
> still have the old addr_info pointer.  Always increase the array first,
> increment the array size counter second.  Do the opposite when shrinking
> the array.

I agree with your point. Holding the rw_lock may alleviate the issue here, but doing as you suggest doesn't hurt. Not sure there is a use-case for shrinking it. Either few addresses are used, the old maximum at 4 worked, or we have another use-case where many addresses are used, but not typically deleted.

>> +			ep->addr_info = realloc(ep->addr_info, ep->nmbr_ep_addrs * sizeof(*ep->addr_info));
>> +			if (!ep->addr_info) {
>> +				ret = ENOMEM;
>> +				--ep->nmbr_ep_addrs;
>> +				goto out;
> 
> You just leaked the old addr_info on a failed realloc because you saved
> into your original pointer, thereby overwriting it, and realloc does not
> free the old pointer on failure.

I sure did. Will fix.

> You have this same mistaken construct in patch 2/4.

Noted.

> In general, this patch, as it stands, looks very, very fragile to me. 
> The old method used a static array of a fixed size that never changed
> address, so all sorts of simultaneous accesses were safe, but in the new
> way of doing things that's no longer true.  

Safe in the sense it doesn't crash, but not safe in the sense that it could use incorrect data. Look at this hunk in acm_ep_insert_addr():

	ep->addr_info[i].addr.type = addr_type;  <== Now the entry is valid, but
	strncpy(ep->addr_info[i].string_buf, name, ACM_MAX_ADDRESS);
	memcpy(ep->addr_info[i].addr.info.addr, tmp, ACM_MAX_ADDRESS);  <=== address info copied here...

Well, the above is no excuse for me not doing the right thing ;-)

> Our addr_info pointer can
> change mid-way through one of those for loops.  And our max address
> count can also change mid-way through one of those for loops.  We have
> lots of those loops, so I think you are going to need some sort of
> locking around them.  Either a rwlock_t or something else.  I suspect
> the test I listed above will crash this new feature rather reliably.

I do not like this series either. Let me suggest; 1) I clean the commits up and 2), I go back to re-evaluate the use of Port GUID instead of Node GUID.

The reason for 2) is that I do not think the ibacm address file will change for most cases, which I presume is a single HCA with one of two ports, and only one address assigned per port. After all, the address file doesn't contain any GUIDs. If the address file doesn't change, my take is to abandon this series and use Port GUIDs instead. The fix, in case of incompatibility, would simply be to delete the address file before starting ibacm.

And me confused, since ibacm creates the address file if it is not there, why isn't created _every_ time ibacm is started? And if so, why is it needed at all?

Thxs, Håkon