Hi Mike,

> -----Original Message-----
> From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-owner@xxxxxxxxxxxxxxx>
> On Behalf Of Marciniszyn, Mike
> Sent: Wednesday, April 17, 2019 1:56 PM
> To: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> Cc: linux-rdma@xxxxxxxxxxxxxxx
> Subject: new trace in RDMA next
>
> Leon,
>
> We seem to be getting a new trace with the RDMA for-next patches.
>
> Here is the trace:
>
> systemd: Starting Hostname Service...
> kobject (00000000b8a5bae6): tried to init an initialized object, something is
> seriously wrong.
> CPU: 68 PID: 2098 Comm: (ostnamed) Not tainted 5.1.0-rc4 #1
>
> Call Trace:
>  dump_stack+0x5a/0x73
>  kobject_init+0x74/0x80
>  kobject_init_and_add+0x35/0xb0
>  hfi1_create_port_files+0x6e/0x3c0 [hfi1]
>  ib_setup_port_attrs+0x43b/0x560 [ib_core]
>  add_one_compat_dev+0x16a/0x230 [ib_core]
>  rdma_dev_init_net+0x110/0x160 [ib_core]
>  ops_init+0x38/0xf0
>  setup_net+0xcf/0x1e0
>  copy_net_ns+0xb7/0x130
>  create_new_namespaces+0x11a/0x1b0
>  unshare_nsproxy_namespaces+0x55/0xa0
>  ksys_unshare+0x1a7/0x340
>  __x64_sys_unshare+0xe/0x20
>  do_syscall_64+0x5b/0x180
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> It seems the core is calling init_port(hfi1_create_port_files) twice, the first
> during ib_register_device() and the second from the stack above.
>
> There is a comment in add_one_compat_dev() noting the race:
>
>	/*
>	 * The first of init_net() or ib_register_device() to take the
>	 * compat_devs_mutex wins and gets to add the device. Others will wait
>	 * for completion here.
>	 */
>	mutex_lock(&device->compat_devs_mutex);
>	cdev = xa_load(&device->compat_devs, rnet->id);
>	if (cdev) {
>		ret = 0;
>		goto done;
>	}
>
> I don't see any xa_store() to compat_devs or use of the mutex in the
> ib_register_device() path?
>

The issue is not with xa_store() or the lock. The issue is that
hfi1_create_port_files() is trying to create sysfs files using
ppd->sc2vl_kobj, which was already initialized during ib_register_device().

For every net namespace, my patches add compat devices. These probably
shouldn't invoke init_port, as it is not meaningful for compat devices.

The attached patch should fix it. Can you please test it? I do not have hfi
hardware. I will inline the patch in case the attachment doesn't work.

> Is this a known issue?
>
> Mike
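
For illustration only (this is not the attached patch, and none of it is kernel code): the double-init described above can be modelled in a few lines of userspace C. Every name prefixed with model_ is hypothetical, and the skip_on_compat flag is just my assumption of what "do not invoke init_port on compat devices" amounts to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kobject: only tracks whether init ran. */
struct model_kobj {
	bool initialized;
};

/* Hypothetical stand-in for per-port driver state such as ppd->sc2vl_kobj. */
struct model_port {
	struct model_kobj sc2vl_kobj;
	int double_init_count;	/* times init hit an already-initialized kobject */
};

/*
 * Models the situation in which the kernel logs "tried to init an
 * initialized object". The model only counts the event.
 */
static void model_kobject_init(struct model_port *port)
{
	if (port->sc2vl_kobj.initialized) {
		port->double_init_count++;
		return;
	}
	port->sc2vl_kobj.initialized = true;
}

/* Models a driver init_port callback such as hfi1_create_port_files(). */
static void model_init_port(struct model_port *port)
{
	model_kobject_init(port);
}

/*
 * Models port attribute setup. is_compat marks a per-net-namespace compat
 * device; skip_on_compat selects the assumed fixed behaviour.
 */
static void model_setup_port_attrs(struct model_port *port, bool is_compat,
				   bool skip_on_compat)
{
	assert(port != NULL);

	if (is_compat && skip_on_compat)
		return;		/* compat device: leave driver port state alone */

	model_init_port(port);
}

/*
 * One device registration followed by nr_net_ns new net namespaces.
 * Returns how many times the port kobject was initialized twice.
 */
static int model_run(int nr_net_ns, bool skip_on_compat)
{
	struct model_port port = {
		.sc2vl_kobj = { .initialized = false },
		.double_init_count = 0,
	};
	int i;

	/* Registration path: the real device, init_port is expected here. */
	model_setup_port_attrs(&port, false, skip_on_compat);

	/* Each new net namespace adds one compat device for the same port. */
	for (i = 0; i < nr_net_ns; i++)
		model_setup_port_attrs(&port, true, skip_on_compat);

	return port.double_init_count;
}
```

With skip_on_compat false, model_run() reports one double init per net namespace, which corresponds to the unshare() path in the trace. With it true, the driver callback runs only once, on the registration path.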
Attachment:
0001-RDMA-core-Do-not-invoke-init_port-on-compat-devices.patch