From: Parav Pandit <parav@xxxxxxxxxxxx> There are few issues with validation of netdevice and listen id lookup for IB (IPoIB) while processing incoming CM request as below. 1. While performing lookup of bind_list in cma_ps_find(), net namespace of the netdevice can get deleted in cma_exit_net(), resulting in use after free access of idr and/or net namespace structures. This lookup occurs from the workqueue context (and not userspace context where net namespace is always valid). CPU0 CPU1 ==== ==== bind_list = cma_ps_find(); move netdevice to new namespace delete net namespace cma_exit_net() idr_destroy(idr); [..] cma_find_listener(bind_list, ..); 2. While netdevice is validated for IP address in given net namespace, netdevice's net namespace and/or ifindex can change in cma_get_net_dev() and cma_match_net_dev(). Above issues are overcome by using rcu lock along with netdevice UP/DOWN state as described below. When a net namespace is getting deleted, netdevice is closed and shutdown before moving it back to init_net namespace. change_net_namespace() synchronizes with any existing use of netdevice before changing the netdev properties such as net or ifindex. Once netdevice IFF_UP flags is cleared, such fields are not guaranteed to be valid. Therefore, rcu lock along with netdevice state check ensures that, while route lookup and cm_id lookup is in progress, netdevice of interest won't migrate to any other net namespace. This ensures that associated net namespace of netdevice won't get deleted while rcu lock is held for netdevice which is in IFF_UP state. Fixes: fa20105e09e9 ("IB/cma: Add support for network namespaces") Fixes: 4be74b42a6d0 ("IB/cma: Separate port allocation to network namespaces") Fixes: f887f2ac87c2 ("IB/cma: Validate routing of incoming requests") Signed-off-by: Parav Pandit <parav@xxxxxxxxxxxx> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx> --- drivers/infiniband/core/cma.c | 53 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 43 insertions(+), 10 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 51a641002e10..8364223422d0 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -382,6 +382,8 @@ struct cma_hdr { #define CMA_VERSION 0x00 struct cma_req_info { + struct sockaddr_storage listen_addr_storage; + struct sockaddr_storage src_addr_storage; struct ib_device *device; int port; union ib_gid local_gid; @@ -1340,11 +1342,11 @@ static bool validate_net_dev(struct net_device *net_dev, } static struct net_device *cma_get_net_dev(struct ib_cm_event *ib_event, - const struct cma_req_info *req) + struct cma_req_info *req) { - struct sockaddr_storage listen_addr_storage, src_addr_storage; - struct sockaddr *listen_addr = (struct sockaddr *)&listen_addr_storage, - *src_addr = (struct sockaddr *)&src_addr_storage; + struct sockaddr *listen_addr = + (struct sockaddr *)&req->listen_addr_storage; + struct sockaddr *src_addr = (struct sockaddr *)&req->src_addr_storage; struct net_device *net_dev; const union ib_gid *gid = req->has_gid ? &req->local_gid : NULL; int err; @@ -1359,11 +1361,6 @@ static struct net_device *cma_get_net_dev(struct ib_cm_event *ib_event, if (!net_dev) return ERR_PTR(-ENODEV); - if (!validate_net_dev(net_dev, listen_addr, src_addr)) { - dev_put(net_dev); - return ERR_PTR(-EHOSTUNREACH); - } - return net_dev; } @@ -1490,15 +1487,51 @@ static struct rdma_id_private *cma_id_from_event(struct ib_cm_id *cm_id, } } + /* + * Net namespace might be getting deleted while route lookup, + * cm_id lookup is in progress. Therefore, perform netdevice + * validation, cm_id lookup under rcu lock. + * RCU lock along with netdevice state check, synchronizes with + * netdevice migrating to different net namespace and also avoids + * case where net namespace doesn't get deleted while lookup is in + * progress. + * If the device state is not IFF_UP, its properties such as ifindex + * and nd_net cannot be trusted to remain valid without rcu lock. + * net/core/dev.c change_net_namespace() ensures to synchronize with + * ongoing operations on net device after device is closed using + * synchronize_net(). + */ + rcu_read_lock(); + if (*net_dev) { + /* + * If netdevice is down, it is likely that it is administratively + * down or it might be migrating to different namespace. + * In that case avoid further processing, as the net namespace + * or ifindex may change. + */ + if (((*net_dev)->flags & IFF_UP) == 0) { + id_priv = ERR_PTR(-EHOSTUNREACH); + goto err; + } + + if (!validate_net_dev(*net_dev, + (struct sockaddr *)&req.listen_addr_storage, + (struct sockaddr *)&req.src_addr_storage)) { + id_priv = ERR_PTR(-EHOSTUNREACH); + goto err; + } + } + bind_list = cma_ps_find(*net_dev ? dev_net(*net_dev) : &init_net, rdma_ps_from_service_id(req.service_id), cma_port_from_service_id(req.service_id)); id_priv = cma_find_listener(bind_list, cm_id, ib_event, &req, *net_dev); +err: + rcu_read_unlock(); if (IS_ERR(id_priv) && *net_dev) { dev_put(*net_dev); *net_dev = NULL; } - return id_priv; } -- 2.14.3 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html