On Wed, Mar 06, 2019 at 07:18:19PM +0200, Leon Romanovsky wrote:
> From: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>
>
> Our testing revealed a use-after-free problem in the alias-guid
> mechanism in the mlx4 driver (this driver is the sole user of the
> ib sa aliasguid query).
>
> The problem is that the caller of ib_sa_guid_info_rec_query() stored
> its return value (query id) in an id field in the callback context.
> However, if the query callback function (which deletes the cb context)
> completes before the query returns to the caller, we get a
> use-after-free.
>
> This is shown by the following stack trace:
>
> [2018-09-01 04:44:34] BUG: KASAN: use-after-free in set_guid_rec+0x806/0xc90 [mlx4_ib]
> [2018-09-01 04:44:34] Write of size 4 at addr ffff881049a3fc00 by task kworker/u48:7/14578
> [2018-09-01 04:44:34]
> [2018-09-01 04:44:34] CPU: 9 PID: 14578 Comm: kworker/u48:7 Tainted: G OE 4.19.0-rc1-for-upstream-dbg-2018-08-31_15-07-54-20 #1
> [2018-09-01 04:44:34] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
> [2018-09-01 04:44:34] Workqueue: alias_guid0 alias_guid_work [mlx4_ib]
> [2018-09-01 04:44:34] Call Trace:
> [2018-09-01 04:44:34]  dump_stack+0x9a/0xeb
> [2018-09-01 04:44:34]  print_address_description+0xe3/0x2e0
> [2018-09-01 04:44:34]  kasan_report+0x18a/0x2e0
> [2018-09-01 04:44:34]  ? set_guid_rec+0x806/0xc90 [mlx4_ib]
> [2018-09-01 04:44:34]  set_guid_rec+0x806/0xc90 [mlx4_ib]
> [2018-09-01 04:44:34]  ? invalidate_guid_record+0x450/0x450 [mlx4_ib]
> [2018-09-01 04:44:34]  ? _raw_spin_unlock_irqrestore+0x59/0x70
> [2018-09-01 04:44:34]  alias_guid_work+0x4da/0x950 [mlx4_ib]
> [2018-09-01 04:44:34]  process_one_work+0x912/0x1830
> [2018-09-01 04:44:34]  ? wq_pool_ids_show+0x310/0x310
> [2018-09-01 04:44:34]  ? lock_acquire+0x145/0x3a0
> [2018-09-01 04:44:34]  worker_thread+0x87/0xbb0
> [2018-09-01 04:44:34]  ? process_one_work+0x1830/0x1830
> [2018-09-01 04:44:34]  kthread+0x322/0x3e0
> [2018-09-01 04:44:34]  ? kthread_create_worker_on_cpu+0xc0/0xc0
> [2018-09-01 04:44:34]  ret_from_fork+0x3a/0x50
> [2018-09-01 04:44:34]
> [2018-09-01 04:44:34] Allocated by task 14578:
> [2018-09-01 04:44:34]  kasan_kmalloc+0xa0/0xd0
> [2018-09-01 04:44:34]  kmem_cache_alloc_trace+0x160/0x370
> [2018-09-01 04:44:34]  set_guid_rec+0x2bb/0xc90 [mlx4_ib]
> [2018-09-01 04:44:34]  alias_guid_work+0x4da/0x950 [mlx4_ib]
> [2018-09-01 04:44:34]  process_one_work+0x912/0x1830
> [2018-09-01 04:44:34]  ? wq_pool_ids_show+0x310/0x310
> [2018-09-01 04:44:34]  ? lock_acquire+0x145/0x3a0
> [2018-09-01 04:44:34]  worker_thread+0x87/0xbb0
> [2018-09-01 04:44:34]  ? process_one_work+0x1830/0x1830
> [2018-09-01 04:44:34]  kthread+0x322/0x3e0
> [2018-09-01 04:44:34]  ? kthread_create_worker_on_cpu+0xc0/0xc0
> [2018-09-01 04:44:34]  ret_from_fork+0x3a/0x50
> [2018-09-01 04:44:34]
> [2018-09-01 04:44:34] Freed by task 14312:
> [2018-09-01 04:44:34]  __kasan_slab_free+0x11d/0x160
> [2018-09-01 04:44:34]  kfree+0xf5/0x2f0
> [2018-09-01 04:44:34]  aliasguid_query_handler+0x620/0x1980 [mlx4_ib]
> [2018-09-01 04:44:34]  ib_sa_guidinfo_rec_callback+0x158/0x190 [ib_core]
> [2018-09-01 04:44:34]  recv_handler+0x13a/0x180 [ib_core]
> [2018-09-01 04:44:34]  ib_mad_recv_done+0x1aea/0x2c20 [ib_core]
> [2018-09-01 04:44:34]  __ib_process_cq+0xf4/0x1e0 [ib_core]
> [2018-09-01 04:44:34]  ib_cq_poll_work+0x45/0x100 [ib_core]
> [2018-09-01 04:44:34]  process_one_work+0x912/0x1830
> [2018-09-01 04:44:34]  worker_thread+0x87/0xbb0
> [2018-09-01 04:44:34]  kthread+0x322/0x3e0
> [2018-09-01 04:44:34]  ret_from_fork+0x3a/0x50
>
> This problem is fixed by a patch to sa_query which allows the
> alias-guid mechanism to control the sa_query's query buf lifetime
> (by having the caller allocate/delete the sa query buffer).
>
> This allows the query buffer to be available for supplying the
> query id at the point it is needed, and then deleted by the alias
> guid mechanism after it obtains the query id. Thus, the query_id
> returned by the call to ib_sa_guid_info_rec_query() does not need
> to be stored.
>
> Fixes: a0c64a17aba8 ("mlx4: Add alias_guid mechanism")
> Fixes: aeab97ed1503 ("IB/sa: Add GuidInfoRecord query support")
> Signed-off-by: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
>  drivers/infiniband/core/sa_query.c      | 84 ++++++++++--------
>  drivers/infiniband/hw/mlx4/alias_GUID.c | 62 +++++++++---------
>  include/rdma/ib_sa.h                    | 26 +++++++-
>  3 files changed, 89 insertions(+), 83 deletions(-)
>
> diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
> index 7925e45ea88a..6c221ec91964 100644
> --- a/drivers/infiniband/core/sa_query.c
> +++ b/drivers/infiniband/core/sa_query.c
> @@ -104,21 +104,6 @@ struct ib_sa_device {
>          struct ib_sa_port port[0];
>  };
>
> -struct ib_sa_query {
> -        void (*callback)(struct ib_sa_query *, int, struct ib_sa_mad *);
> -        void (*release)(struct ib_sa_query *);
> -        struct ib_sa_client *client;
> -        struct ib_sa_port *port;
> -        struct ib_mad_send_buf *mad_buf;
> -        struct ib_sa_sm_ah *sm_ah;
> -        int id;
> -        u32 flags;
> -        struct list_head list; /* Local svc request list */
> -        u32 seq; /* Local svc request sequence number */
> -        unsigned long timeout; /* Local svc timeout */
> -        u8 path_use; /* How will the pathrecord be used */
> -};
> -
>  #define IB_SA_ENABLE_LOCAL_SERVICE 0x00000001
>  #define IB_SA_CANCEL 0x00000002
>  #define IB_SA_QUERY_OPA 0x00000004
> @@ -136,12 +121,6 @@ struct ib_sa_path_query {
>          struct sa_path_rec *conv_pr;
>  };
>
> -struct ib_sa_guidinfo_query {
> -        void (*callback)(int, struct ib_sa_guidinfo_rec *, void *);
> -        void *context;
> -        struct ib_sa_query sa_query;
> -};
> -
>  struct ib_sa_classport_info_query {
>          void (*callback)(void *);
>          void *context;
> @@ -1871,23 +1850,26 @@ static void ib_sa_guidinfo_rec_callback(struct ib_sa_query *sa_query,
>                                          int status,
>                                          struct ib_sa_mad *mad)
>  {
> -        struct ib_sa_guidinfo_query *query =
> -                container_of(sa_query, struct ib_sa_guidinfo_query, sa_query);
> -
>          if (mad) {
>                  struct ib_sa_guidinfo_rec rec;
>
>                  ib_unpack(guidinfo_rec_table, ARRAY_SIZE(guidinfo_rec_table),
>                            mad->data, &rec);
> -                query->callback(status, &rec, query->context);
> +                sa_query->guidinfo_cb(status, &rec, sa_query);
>          } else
> -                query->callback(status, NULL, query->context);
> +                sa_query->guidinfo_cb(status, NULL, sa_query);
>  }
>
> -static void ib_sa_guidinfo_rec_release(struct ib_sa_query *sa_query)
> +/**
> + * ib_sa_get_query_id() - Returns the sa query id
> + * @sa_query: Pointer to the sa_query structure containing id
> + * Return: id of the query
> + */
> +int ib_sa_get_query_id(struct ib_sa_query *sa_query)
>  {
> -        kfree(container_of(sa_query, struct ib_sa_guidinfo_query, sa_query));
> +        return sa_query->id;
>  }
> +EXPORT_SYMBOL(ib_sa_get_query_id);
>
>  int ib_sa_guid_info_rec_query(struct ib_sa_client *client,
>                                struct ib_device *device, u8 port_num,
> @@ -1896,11 +1878,10 @@ int ib_sa_guid_info_rec_query(struct ib_sa_client *client,
>                                unsigned long timeout_ms, gfp_t gfp_mask,
>                                void (*callback)(int status,
>                                                 struct ib_sa_guidinfo_rec *resp,
> -                                               void *context),
> +                                               struct ib_sa_query *sa_query),
>                                void *context,
> -                              struct ib_sa_query **sa_query)
> +                              struct ib_sa_query *sa_query)
>  {
> -        struct ib_sa_guidinfo_query *query;
>          struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client);
>          struct ib_sa_port *port;
>          struct ib_mad_agent *agent;
> @@ -1919,25 +1900,27 @@ int ib_sa_guid_info_rec_query(struct ib_sa_client *client,
>          port = &sa_dev->port[port_num - sa_dev->start_port];
>          agent = port->agent;
>
> -        query = kzalloc(sizeof(*query), gfp_mask);
> -        if (!query)
> -                return -ENOMEM;
> -
> -        query->sa_query.port = port;
> -        ret = alloc_mad(&query->sa_query, gfp_mask);
> +        sa_query->port = port;
> +        ret = alloc_mad(sa_query, gfp_mask);
>          if (ret)
>                  goto err1;
>
>          ib_sa_client_get(client);
> -        query->sa_query.client = client;
> -        query->callback = callback;
> -        query->context = context;
> +        sa_query->client = client;
>
> -        mad = query->sa_query.mad_buf->mad;
> -        init_mad(&query->sa_query, agent);
> +        mad = sa_query->mad_buf->mad;
> +        init_mad(sa_query, agent);
>
> -        query->sa_query.callback = callback ? ib_sa_guidinfo_rec_callback : NULL;
> -        query->sa_query.release = ib_sa_guidinfo_rec_release;
> +        sa_query->callback = ib_sa_guidinfo_rec_callback;
> +        sa_query->guidinfo_cb = callback;
> +
> +        /*
> +         * The query buffer is part of the callback context,
> +         * which gets freed by the caller (in aliasGUID.c).
> +         * Thus, we do not release the query buffer in the
> +         * sa_query module.
> +         */
> +        sa_query->release = NULL;
>
>          mad->mad_hdr.method = method;
>          mad->mad_hdr.attr_id = cpu_to_be16(IB_SA_ATTR_GUID_INFO_REC);
> @@ -1946,21 +1929,17 @@ int ib_sa_guid_info_rec_query(struct ib_sa_client *client,
>          ib_pack(guidinfo_rec_table, ARRAY_SIZE(guidinfo_rec_table), rec,
>                  mad->data);
>
> -        *sa_query = &query->sa_query;
> -
> -        ret = send_mad(&query->sa_query, timeout_ms, gfp_mask);
> +        ret = send_mad(sa_query, timeout_ms, gfp_mask);
>          if (ret < 0)
>                  goto err2;
>
>          return ret;
>
>  err2:
> -        *sa_query = NULL;
> -        ib_sa_client_put(query->sa_query.client);
> -        free_mad(&query->sa_query);
> +        ib_sa_client_put(sa_query->client);
> +        free_mad(sa_query);
>
>  err1:
> -        kfree(query);
>          return ret;
>  }
>  EXPORT_SYMBOL(ib_sa_guid_info_rec_query);
> @@ -2195,7 +2174,8 @@ static void send_handler(struct ib_mad_agent *agent,
>          free_mad(query);
>          if (query->client)
>                  ib_sa_client_put(query->client);
> -        query->release(query);
> +        if (query->release)
> +                query->release(query);
>  }
>
>  static void recv_handler(struct ib_mad_agent *mad_agent,
> diff --git a/drivers/infiniband/hw/mlx4/alias_GUID.c b/drivers/infiniband/hw/mlx4/alias_GUID.c
> index 2a0b59a4b6eb..a70613fc65cd 100644
> --- a/drivers/infiniband/hw/mlx4/alias_GUID.c
> +++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
> @@ -53,9 +53,8 @@ Whenever we receive an smp mad GUIDInfo record, the data will be cached.
>  struct mlx4_alias_guid_work_context {
>          u8 port;
>          struct mlx4_ib_dev *dev ;
> -        struct ib_sa_query *sa_query;
> +        struct ib_sa_query sa_query;
>          struct completion done;
> -        int query_id;
>          struct list_head list;
>          int block_num;
>          ib_sa_comp_mask guid_indexes;
> @@ -287,10 +286,10 @@ void mlx4_ib_notify_slaves_on_guid_change(struct mlx4_ib_dev *dev,
>
>  static void aliasguid_query_handler(int status,
>                                      struct ib_sa_guidinfo_rec *guid_rec,
> -                                    void *context)
> +                                    struct ib_sa_query *sa_query)
>  {
>          struct mlx4_ib_dev *dev;
> -        struct mlx4_alias_guid_work_context *cb_ctx = context;
> +        struct mlx4_alias_guid_work_context *cb_ctx;
>          u8 port_index ;
>          int i;
>          struct mlx4_sriov_alias_guid_info_rec_det *rec;
> @@ -299,9 +298,11 @@ static void aliasguid_query_handler(int status,
>          ib_sa_comp_mask applied_guid_indexes = 0;
>          unsigned int resched_delay_sec = 0;
>
> -        if (!context)
> +        if (!sa_query)
>                  return;
>
> +        cb_ctx = container_of(sa_query, struct mlx4_alias_guid_work_context,
> +                              sa_query);
>          dev = cb_ctx->dev;
>          port_index = cb_ctx->port - 1;
>          rec = &dev->sriov.alias_guid.ports_guid[port_index].
> @@ -440,11 +441,17 @@ static void aliasguid_query_handler(int status,
>                             alias_guid_work,
>                             msecs_to_jiffies(resched_delay_sec * 1000));
>          }
> -        if (cb_ctx->sa_query) {
> -                list_del(&cb_ctx->list);
> +        if (!list_empty(&cb_ctx->list)) {
> +                /* we own the callback context */
> +                list_del_init(&cb_ctx->list);
>                  kfree(cb_ctx);
> -        } else
> +        } else {
> +                /*
> +                 * mlx4_ib_destroy_alias_guid_service owns callback ctx
> +                 * It is waiting for completion before freeing it
> +                 */
>                  complete(&cb_ctx->done);
> +        }
>          spin_unlock_irqrestore(&dev->sriov.alias_guid.ag_work_lock, flags1);
>          spin_unlock_irqrestore(&dev->sriov.going_down_lock, flags);
>  }
> @@ -514,7 +521,7 @@ static int set_guid_rec(struct ib_device *ibdev,
>                  goto new_schedule;
>          }
>
> -        callback_context = kmalloc(sizeof *callback_context, GFP_KERNEL);
> +        callback_context = kzalloc(sizeof(*callback_context), GFP_KERNEL);
>          if (!callback_context) {
>                  err = -ENOMEM;
>                  resched_delay = HZ * 5;
> @@ -541,19 +548,17 @@ static int set_guid_rec(struct ib_device *ibdev,
>          list_add_tail(&callback_context->list, head);
>          spin_unlock_irqrestore(&dev->sriov.alias_guid.ag_work_lock, flags1);
>
> -        callback_context->query_id =
> -                ib_sa_guid_info_rec_query(dev->sriov.alias_guid.sa_client,
> -                                          ibdev, port, &guid_info_rec,
> -                                          comp_mask, rec->method, 1000,
> -                                          GFP_KERNEL, aliasguid_query_handler,
> -                                          callback_context,
> -                                          &callback_context->sa_query);
> -        if (callback_context->query_id < 0) {
> -                pr_debug("ib_sa_guid_info_rec_query failed, query_id: "
> -                         "%d. will reschedule to the next 1 sec.\n",
> -                         callback_context->query_id);
> +        err = ib_sa_guid_info_rec_query(dev->sriov.alias_guid.sa_client,
> +                                        ibdev, port, &guid_info_rec,
> +                                        comp_mask, rec->method, 1000,
> +                                        GFP_KERNEL, aliasguid_query_handler,
> +                                        callback_context,
> +                                        &callback_context->sa_query);
> +        if (err < 0) {
> +                pr_debug("ib_sa_guid_info_rec_query failed, err %d. Retrying in 1 sec.\n",
> +                         err);
>                  spin_lock_irqsave(&dev->sriov.alias_guid.ag_work_lock, flags1);
> -                list_del(&callback_context->list);
> +                list_del_init(&callback_context->list);
>                  kfree(callback_context);
>                  spin_unlock_irqrestore(&dev->sriov.alias_guid.ag_work_lock, flags1);
>                  resched_delay = 1 * HZ;
> @@ -800,7 +805,6 @@ void mlx4_ib_destroy_alias_guid_service(struct mlx4_ib_dev *dev)
>          struct mlx4_ib_sriov *sriov = &dev->sriov;
>          struct mlx4_alias_guid_work_context *cb_ctx;
>          struct mlx4_sriov_alias_guid_port_rec_det *det;
> -        struct ib_sa_query *sa_query;
>          unsigned long flags;
>
>          for (i = 0 ; i < dev->num_ports; i++) {
> @@ -811,14 +815,16 @@ void mlx4_ib_destroy_alias_guid_service(struct mlx4_ib_dev *dev)
>                          cb_ctx = list_entry(det->cb_list.next,
>                                              struct mlx4_alias_guid_work_context,
>                                              list);
> -                        sa_query = cb_ctx->sa_query;
> -                        cb_ctx->sa_query = NULL;
> -                        list_del(&cb_ctx->list);
> -                        spin_unlock_irqrestore(&sriov->alias_guid.ag_work_lock, flags);
> -                        ib_sa_cancel_query(cb_ctx->query_id, sa_query);
> +                        /* own callback context by removing from list */
> +                        list_del_init(&cb_ctx->list);
> +                        spin_unlock_irqrestore(&sriov->alias_guid.ag_work_lock,
> +                                               flags);
> +                        ib_sa_cancel_query(ib_sa_get_query_id(&cb_ctx->sa_query),
> +                                           &cb_ctx->sa_query);

Let's have new APIs that make sense; instead of adding an
'ib_sa_get_query_id', let's do:

        // Only for use by aliasGUID sa_query
        ib_sa_cancel_query_ptr(&cb_ctx->sa_query);

>                          wait_for_completion(&cb_ctx->done);

This doesn't seem right... ib_sa_cancel_query is supposed to be a full
fence: once it returns, the handler is not running and won't run again,
so we can just go ahead and do kfree without any more stuff (ie the SA
code should be handling the locking that this completion is trying to
create).

Looks like ib_sa_cancel_query is broken in this regard, so it needs
some fixing. It doesn't synchronize against a running send_handler,
and it continues to touch the query after it returns via
ib_nl_request_timeout.

Jason