Re: [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On 20 Jan 2020, at 16:06, Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> 
> On Fri, Jan 17, 2020 at 02:56:22PM +0100, Håkon Bugge wrote:
>> Using CX-3 virtual functions, either from a bare-metal machine or
>> pass-through from a VM, MAD packets are proxied through the PF driver.
>> 
>> Since the VF drivers have separate name spaces for MAD Transaction Ids
>> (TIDs), the PF driver has to re-map the TIDs and keep the book keeping
>> in a cache.
>> 
>> Following the RDMA Connection Manager (CM) protocol, it is clear when
>> an entry has to evicted from the cache. When a DREP is sent from
>> mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar
>> when a REJ is received by the mlx4_ib_demux_cm_handler(),
>> id_map_find_del() is called.
>> 
>> This function wipes out the TID in use from the IDR or XArray and
>> removes the id_map_entry from the table.
>> 
>> In short, it does everything except the topping of the cake, which is
>> to remove the entry from the list and free it. In other words, for the
>> DREP and REJ cases enumerated above, both will leak one id_map_entry.
>> 
>> Signed-off-by: Håkon Bugge <haakon.bugge@xxxxxxxxxx>
>> ---
>> drivers/infiniband/hw/mlx4/cm.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
>> index ecd6cadd529a..1df6d3ccfc62 100644
>> --- a/drivers/infiniband/hw/mlx4/cm.c
>> +++ b/drivers/infiniband/hw/mlx4/cm.c
>> @@ -197,8 +197,13 @@ static void id_map_find_del(struct ib_device *ibdev, int pv_cm_id)
>> 	if (!ent)
>> 		goto out;
>> 	found_ent = id_map_find_by_sl_id(ibdev, ent->slave_id, ent->sl_cm_id);
>> -	if (found_ent && found_ent == ent)
>> +	if (found_ent && found_ent == ent) {
>> 		rb_erase(&found_ent->node, sl_id_map);
>> +		if (!ent->scheduled_delete) {
> 
> Why do we need to check scheduled_delete?

1. Node receives a DREQ and mlx4_ib_demux_cm_handler() is called, which again calls schedule_delayed(), which sets scheduled_delete.

2. DREQ is proxied over to the VM, which replies with a DREP.

3. The DREP is proxied over to the PF driver, mlx4_ib_multiplex_cm_handler() is called, id_map_find_del() is called. If it is freed now (without checking scheduled_delete), it will be a double free when the delayed work kicks in.

But this documents that the commit message is not accurate, it is only the REJ case that has a leak.

> Isn't this to mark call to timeout cleanup (id_map_ent_timeout), which
> can't race with id_map_find_del()? They both hold the same spinlock.

No race, but it can be set as per the above.

If you agree, I will send a v2 with corrected commit message.


Thxs, Håkon


> 
> Thanks
> 
>> +			list_del(&ent->list);
>> +			kfree(ent);
>> +		}
>> +	}
>> out:
>> 	spin_unlock(&sriov->id_map_lock);
>> }
>> --
>> 2.20.1
>> 





[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux