Hi Leon, > -----Original Message----- > From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma- > owner@xxxxxxxxxxxxxxx] On Behalf Of Leon Romanovsky > Sent: Tuesday, March 20, 2018 8:46 AM > To: Doug Ledford <dledford@xxxxxxxxxx>; Jason Gunthorpe > <jgg@xxxxxxxxxxxx> > Cc: Leon Romanovsky <leonro@xxxxxxxxxxxx>; RDMA mailing list <linux- > rdma@xxxxxxxxxxxxxxx>; Mark Bloch <markb@xxxxxxxxxxxx> > Subject: [PATCH rdma-rc 1/2] RDMA/restrack: Add ability to create non- > traceable restrack objects > > From: Leon Romanovsky <leonro@xxxxxxxxxxxx> > > The fact that Resource tracking 02d8883f520e ("RDMA/restrack: Add general > infrastructure to track RDMA resources") was added immediately after commit > 16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove > stages") caused to miss the fact that PD and CQ should be skipped from > resources tracking for the UMR allocations. > > Fix introduced in commit 42cea83f9524 ("IB/mlx5: Fix cleanup order on unload") > revealed this fact, so this patch adds option to mark any object to be non- > traceable. > > It fixes resource tracking warnings during shutdown. > > [ 43.473906] CPU: 5 PID: 3016 Comm: modprobe Not tainted 4.16.0-rc5-for- > linust-perf-2018-03-19_07-01-58-14 #1 > [ 43.473907] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > Ubuntu-1.8.2-1ubuntu2 04/01/2014 > [ 43.473919] RIP: 0010:rdma_restrack_clean+0x25/0x30 [ib_core] > [ 43.473921] RSP: 0018:ffffc9000267be48 EFLAGS: 00010282 > [ 43.473924] RAX: 0000000000000000 RBX: ffff88033c690070 RCX: > 0000000180080006 > [ 43.473925] RDX: ffff88035ce922e0 RSI: ffffea000cf1a200 RDI: > ffff88033c6907c8 > [ 43.473926] RBP: ffff88033c690070 R08: ffff88033c689000 R09: > 0000000180080006 > [ 43.473927] R10: 000000003c68a001 R11: ffff88033c689000 R12: > ffff88033c690000 > [ 43.473929] R13: ffff88033c69005c R14: 0000000000000000 R15: > 0000000000000000 > [ 43.473932] FS: 00007f5928359740(0000) GS:ffff88036c540000(0000) > knlGS:0000000000000000 > [ 43.473933] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 43.473935] CR2: 00007ffffc760cc8 CR3: 000000035620c000 CR4: > 00000000000006e0 > [ 43.473940] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [ 43.473941] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: > 0000000000000400 > [ 43.473942] Call Trace: > [ 43.473969] ib_unregister_device+0xf5/0x190 [ib_core] > [ 43.474000] __mlx5_ib_remove+0x2e/0x40 [mlx5_ib] > [ 43.474098] mlx5_remove_device+0xf5/0x120 [mlx5_core] > [ 43.474132] mlx5_unregister_interface+0x37/0x90 [mlx5_core] > [ 43.474142] mlx5_ib_cleanup+0xc/0x16a [mlx5_ib] > [ 43.474152] SyS_delete_module+0x159/0x260 > [ 43.474159] do_syscall_64+0x61/0x110 > [ 43.474165] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > [ 43.474168] RIP: 0033:0x7f59278466b7 > [ 43.474170] RSP: 002b:00007ffffc763e38 EFLAGS: 00000202 ORIG_RAX: > 00000000000000b0 > [ 43.474172] RAX: ffffffffffffffda RBX: 000000000130d590 RCX: > 00007f59278466b7 > [ 43.474173] RDX: 0000000000000000 RSI: 0000000000000800 RDI: > 000000000130d5f8 > [ 43.474175] RBP: 0000000000000000 R08: 00007f5927b0b060 R09: > 00007f59278b6a40 > [ 43.474176] R10: 00007ffffc763bc0 R11: 0000000000000202 R12: > 0000000000000000 > [ 43.474177] R13: 0000000000000001 R14: 000000000130d5f8 R15: > 0000000000000000 > [ 43.474179] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 83 c7 28 31 c0 > eb 0c 48 83 c0 08 48 3d 00 08 00 00 74 0f 48 8d 14 07 48 8b 12 48 85 d2 > 74 e8 <0f> 0b c3 f3 c3 66 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 47 28 > [ 43.474221] ---[ end trace e89771e2250ffc23 ]--- > > Fixes: 42cea83f9524 ("IB/mlx5: Fix cleanup order on unload") > Reviewed-by: Mark Bloch <markb@xxxxxxxxxxxx> > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx> > --- > drivers/infiniband/core/core_priv.h | 8 ++++---- > drivers/infiniband/core/cq.c | 9 +++++++-- > drivers/infiniband/core/verbs.c | 8 ++++++-- > drivers/infiniband/hw/mlx5/main.c | 4 ++-- > include/rdma/ib_verbs.h | 13 +++++++++---- > include/rdma/restrack.h | 9 +++++++++ > 6 files changed, 37 insertions(+), 14 deletions(-) > > diff --git a/drivers/infiniband/core/core_priv.h > b/drivers/infiniband/core/core_priv.h > index 25bb178f6074..50df910ca5c1 100644 > --- a/drivers/infiniband/core/core_priv.h > +++ b/drivers/infiniband/core/core_priv.h > @@ -325,11 +325,11 @@ static inline struct ib_qp *_ib_create_qp(struct > ib_device *dev, > * and more importantly they are created internaly by driver, > * see mlx5 create_dev_resources() as an example. > */ > - if (attr->qp_type < IB_QPT_XRC_INI) { > - qp->res.type = RDMA_RESTRACK_QP; > + qp->res.type = RDMA_RESTRACK_QP; > + if (attr->qp_type < IB_QPT_XRC_INI) > rdma_restrack_add(&qp->res); > - } else > - qp->res.valid = false; > + else > + rdma_restrack_dontrack(&qp->res); > > return qp; > } > diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c index > af5ad6a56ae4..119da8e7bd8c 100644 > --- a/drivers/infiniband/core/cq.c > +++ b/drivers/infiniband/core/cq.c > @@ -128,6 +128,7 @@ static void ib_cq_completion_workqueue(struct ib_cq > *cq, void *private) > * @comp_vector: HCA completion vectors for this CQ > * @poll_ctx: context to poll the CQ from. > * @caller: module owner name. > + * @skip_tracking: avoid resource tracking > * > * This is the proper interface to allocate a CQ for in-kernel users. A > * CQ allocated with this interface will automatically be polled from the @@ - > 136,7 +137,8 @@ static void ib_cq_completion_workqueue(struct ib_cq *cq, > void *private) > */ > struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private, > int nr_cqe, int comp_vector, > - enum ib_poll_context poll_ctx, const char *caller) > + enum ib_poll_context poll_ctx, > + const char *caller, bool skip_tracking) > { > struct ib_cq_init_attr cq_attr = { > .cqe = nr_cqe, > @@ -162,7 +164,10 @@ struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void > *private, > > cq->res.type = RDMA_RESTRACK_CQ; > cq->res.kern_name = caller; > - rdma_restrack_add(&cq->res); > + if (skip_tracking) > + rdma_restrack_dontrack(&cq->res); > + else > + rdma_restrack_add(&cq->res); > > switch (cq->poll_ctx) { > case IB_POLL_DIRECT: > diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c > index 93025d2009b8..76df533ca473 100644 > --- a/drivers/infiniband/core/verbs.c > +++ b/drivers/infiniband/core/verbs.c > @@ -238,7 +238,7 @@ EXPORT_SYMBOL(rdma_port_get_link_layer); > * memory operations. > */ > struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags, > - const char *caller) > + const char *caller, bool skip_tracking) > { > struct ib_pd *pd; > int mr_access_flags = 0; > @@ -265,7 +265,11 @@ struct ib_pd *__ib_alloc_pd(struct ib_device *device, > unsigned int flags, > > pd->res.type = RDMA_RESTRACK_PD; > pd->res.kern_name = caller; > - rdma_restrack_add(&pd->res); > + > + if (skip_tracking) > + rdma_restrack_dontrack(&pd->res); > + else > + rdma_restrack_add(&pd->res); > > if (mr_access_flags) { > struct ib_mr *mr; > diff --git a/drivers/infiniband/hw/mlx5/main.c > b/drivers/infiniband/hw/mlx5/main.c > index da091de4e69d..63fd66a8510d 100644 > --- a/drivers/infiniband/hw/mlx5/main.c > +++ b/drivers/infiniband/hw/mlx5/main.c > @@ -3473,14 +3473,14 @@ static int create_umr_res(struct mlx5_ib_dev *dev) > goto error_0; > } > > - pd = ib_alloc_pd(&dev->ib_dev, 0); > + pd = ib_alloc_pd_notrack(&dev->ib_dev, 0); > if (IS_ERR(pd)) { > mlx5_ib_dbg(dev, "Couldn't create PD for sync UMR QP\n"); > ret = PTR_ERR(pd); > goto error_0; > } > > - cq = ib_alloc_cq(&dev->ib_dev, NULL, 128, 0, IB_POLL_SOFTIRQ); > + cq = ib_alloc_cq_notrack(&dev->ib_dev, NULL, 128, 0, > IB_POLL_SOFTIRQ); > if (IS_ERR(cq)) { > mlx5_ib_dbg(dev, "Couldn't create CQ for sync UMR QP\n"); > ret = PTR_ERR(cq); > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index > ff3ed435701f..abc4b697fc3f 100644 > --- a/include/rdma/ib_verbs.h > +++ b/include/rdma/ib_verbs.h > @@ -2867,9 +2867,11 @@ enum ib_pd_flags { }; > > struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags, > - const char *caller); > + const char *caller, bool skip_tracking); > #define ib_alloc_pd(device, flags) \ > - __ib_alloc_pd((device), (flags), KBUILD_MODNAME) > + __ib_alloc_pd((device), (flags), KBUILD_MODNAME, false) #define > +ib_alloc_pd_notrack(device, flags) \ > + __ib_alloc_pd((device), (flags), KBUILD_MODNAME, true) > void ib_dealloc_pd(struct ib_pd *pd); > > /** > @@ -3148,9 +3150,12 @@ static inline int ib_post_recv(struct ib_qp *qp, > > struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private, > int nr_cqe, int comp_vector, > - enum ib_poll_context poll_ctx, const char *caller); > + enum ib_poll_context poll_ctx, const char *caller, > + bool skip_tracking); > #define ib_alloc_cq(device, priv, nr_cqe, comp_vect, poll_ctx) \ > - __ib_alloc_cq((device), (priv), (nr_cqe), (comp_vect), (poll_ctx), > KBUILD_MODNAME) > + __ib_alloc_cq((device), (priv), (nr_cqe), (comp_vect), (poll_ctx), > +KBUILD_MODNAME, false) #define ib_alloc_cq_notrack(device, priv, nr_cqe, > comp_vect, poll_ctx) \ > + __ib_alloc_cq((device), (priv), (nr_cqe), (comp_vect), (poll_ctx), > +KBUILD_MODNAME, true) > > void ib_free_cq(struct ib_cq *cq); > int ib_process_cq_direct(struct ib_cq *cq, int budget); diff --git > a/include/rdma/restrack.h b/include/rdma/restrack.h index > 2cdf8dcf4bdc..f593c2fdaa03 100644 > --- a/include/rdma/restrack.h > +++ b/include/rdma/restrack.h > @@ -150,4 +150,13 @@ int __must_check rdma_restrack_get(struct > rdma_restrack_entry *res); > * @res: resource entry > */ > int rdma_restrack_put(struct rdma_restrack_entry *res); > + > +/** > + * rdma_restrack_dontrack() - mark resource as not valid > + * @res: resource entry > + */ > +static inline void rdma_restrack_dontrack(struct rdma_restrack_entry > +*res) { > + res->valid = false; > +} > #endif /* _RDMA_RESTRACK_H_ */ > -- > 2.14.3 > rdma_restrack_clean() call from ib_unregister_device() is not correct. rdma_restrack_init() is called in ib_alloc_device(). The correct cleanup routine of alloc_device() is ib_dealloc_device(). Therefore rdma_restrack_clean() should be done there. Ib_unregister_device() is not the right place regardless of what is done in this patch. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html