On Thu, Aug 20, 2020 at 08:37:17AM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 19, 2020 at 12:15:45AM +0300, Kamal Heib wrote:
> > On Tue, Aug 18, 2020 at 01:31:57PM -0300, Jason Gunthorpe wrote:
> > > On Tue, Aug 18, 2020 at 05:25:04PM +0300, Kamal Heib wrote:
> > > > To avoid the following kernel panic when calling kmem_cache_create()
> > > > with a NULL pointer from pool_cache(), move the rxe_cache_init() to the
> > > > context of device initialization.
> > >
> > > I think you've hit on a bigger bug than just this oops.
> > >
> > > rxe_net_add() should never be called before rxe_module_init(), that
> > > surely subtly breaks all kinds of things.
> > >
> > > Maybe it is time to remove these module parameters?
> > >
> > Yes, I agree, this can be done in for-next.
> >
> > But at least can we take this patch to for-rc (stable) to fix this issue
> > in stable releases?
>
> If you want to fix something in stable then block the module options
> from working as actual module options - eg before rxe_module_init()
> runs.
>
> Jason

Something like the following patch?

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index 907203afbd99..872ebc57ac06 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -40,6 +40,8 @@ MODULE_AUTHOR("Bob Pearson, Frank Zago, John Groves, Kamal Heib");
 MODULE_DESCRIPTION("Soft RDMA transport");
 MODULE_LICENSE("Dual BSD/GPL");
 
+bool rxe_is_loaded = false;
+
 /* free resources for a rxe device all objects created for this device must
  * have been destroyed
  */
@@ -315,6 +317,7 @@ static int __init rxe_module_init(void)
 		return err;
 
 	rdma_link_register(&rxe_link_ops);
+	rxe_is_loaded = true;
 	pr_info("loaded\n");
 	return 0;
 }
@@ -326,6 +329,7 @@ static void __exit rxe_module_exit(void)
 	rxe_net_exit();
 	rxe_cache_exit();
 
+	rxe_is_loaded = false;
 	pr_info("unloaded\n");
 }
 
diff --git a/drivers/infiniband/sw/rxe/rxe.h b/drivers/infiniband/sw/rxe/rxe.h
index fb07eed9e402..d9b71b5e2fba 100644
--- a/drivers/infiniband/sw/rxe/rxe.h
+++ b/drivers/infiniband/sw/rxe/rxe.h
@@ -67,6 +67,8 @@
 
 #define RXE_ROCE_V2_SPORT		(0xc000)
 
+extern bool rxe_is_loaded;
+
 static inline u32 rxe_crc32(struct rxe_dev *rxe,
 			    u32 crc, void *next, size_t len)
 {
diff --git a/drivers/infiniband/sw/rxe/rxe_sysfs.c b/drivers/infiniband/sw/rxe/rxe_sysfs.c
index ccda5f5a3bc0..12c7ca0764d5 100644
--- a/drivers/infiniband/sw/rxe/rxe_sysfs.c
+++ b/drivers/infiniband/sw/rxe/rxe_sysfs.c
@@ -61,6 +61,11 @@ static int rxe_param_set_add(const char *val, const struct kernel_param *kp)
 	struct net_device *ndev;
 	struct rxe_dev *exists;
 
+	if (!rxe_is_loaded) {
+		pr_err("Please make sure to load the rdma_rxe module first\n");
+		return -EINVAL;
+	}
+
 	len = sanitize_arg(val, intf, sizeof(intf));
 	if (!len) {
 		pr_err("add: invalid interface name\n");

Thanks,
Kamal