Hello,

On Fri, Mar 24, 2017 at 05:53:54PM +0100, Johannes Thumshirn wrote:
> [ +Cc Tejun ]
>
> On Fri, Mar 24, 2017 at 11:44:55AM +0000, John Garry wrote:
> > To be clear, was this the same test with isci which you initially reported?
>
> Yes, just echo into the PCI device's sysfs remove file and it'll trigger the
> problem.
>
> I did some archeology and it seems as if commit bcdde7e ("sysfs: make
> __sysfs_remove_dir() recursive") introduced/uncovered this behavior.

I couldn't reproduce it with other devices (don't have a SAS controller).

> For reference, here's one of my calltraces (the first of 40!):
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 5 at fs/sysfs/group.c:241 sysfs_remove_group+0xc3/0xd0
> sysfs group 'power' not found for kobject 'end_device-6:0'
> CPU: 16 PID: 5884 Comm: repro.sh Not tainted 4.11.0-rc3-libsas+ #504
> Call Trace:
>  dump_stack+0x85/0xc2
>  __warn+0xc6/0xe0
>  warn_slowpath_fmt+0x4a/0x50
>  sysfs_remove_group+0xc3/0xd0
>  dpm_sysfs_remove+0x52/0x60
>  device_del+0x13c/0x360
>  ? device_remove_file+0x14/0x20
>  attribute_container_class_device_del+0x15/0x20
>  transport_remove_classdev+0x4c/0x60
>  ? transport_add_class_device+0x40/0x40
>  attribute_container_device_trigger+0xb3/0xc0
>  transport_remove_device+0x10/0x20
>  sas_port_delete+0x12d/0x160 [scsi_transport_sas]
>  sas_deform_port+0x1bf/0x1d0 [libsas]
>  sas_unregister_ports+0x36/0x50 [libsas]
>  sas_unregister_ha+0x1b/0x40 [libsas]
>  isci_unregister+0x2a/0x40 [isci]
>  isci_pci_remove+0x52/0xb0 [isci]
>  ? __pm_runtime_resume+0x56/0x80
>  pci_device_remove+0x34/0xb0
>  device_release_driver_internal+0x158/0x210
>  device_release_driver+0xd/0x10
>  pci_stop_bus_device+0x85/0x90
>  pci_stop_and_remove_bus_device_locked+0x15/0x30
>  remove_store+0x59/0x70
>  dev_attr_store+0x13/0x20
>  sysfs_kf_write+0x40/0x50
>  kernfs_fop_write+0x130/0x1b0
>  __vfs_write+0x23/0x130
>  ? rcu_read_lock_sched_held+0x6d/0x80
>  ? rcu_sync_lockdep_assert+0x2a/0x50
>  ? __sb_start_write+0xd7/0x1e0
>  ? vfs_write+0x1a4/0x1f0
>  vfs_write+0xc6/0x1f0
>  SyS_write+0x44/0xa0
>  entry_SYSCALL_64_fastpath+0x23/0xc6
>
> But as I said, I don't believe this is a problem in the SAS transport or the
> SAS drivers, but a device core or transport class.

So, what's most likely happening is that the parent device or kobject
which contains the attribute group has already been removed earlier.
Because that removal is recursive, it already took the files underneath
with it, and the later explicit removal then tries to remove files that
are already gone.  It can be fixed either by reordering so that the
parent node is removed after the children, or by simply dropping the
explicit removal of the children.

Thanks.

--
tejun
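
P.S. In case it helps to see the ordering problem in isolation, here is a
toy model in Python.  It is only a sketch and not kernel code: the Node
class, remove_recursive() and remove_group() are made-up stand-ins for a
sysfs directory, the recursive removal, and the explicit group removal,
and "port-6:0" is a hypothetical parent name picked for illustration.
Only the 'power' group and the 'end_device-6:0' kobject come from the
trace above.

```python
class Node:
    """Toy stand-in for a sysfs directory (hypothetical, not a kernfs node)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self

    def remove_recursive(self):
        """Remove this node and everything below it."""
        for child in list(self.children.values()):
            child.remove_recursive()
        if self.parent is not None:
            self.parent.children.pop(self.name, None)
            self.parent = None


def remove_group(dir_node, group):
    """Explicitly remove a named group; return a warning if it is already gone."""
    child = dir_node.children.get(group)
    if child is None:
        return f"sysfs group '{group}' not found for kobject '{dir_node.name}'"
    child.remove_recursive()
    return None


def build():
    # Hypothetical hierarchy: parent -> end device -> 'power' group.
    port = Node("port-6:0")
    end_dev = Node("end_device-6:0", port)
    Node("power", end_dev)
    return port, end_dev


# Problematic order: the parent goes first and recursion takes 'power' with it,
# so the later explicit removal finds nothing.
port, end_dev = build()
port.remove_recursive()
buggy = remove_group(end_dev, "power")

# Reordered: the explicit removal of the child runs first, the parent goes last.
port, end_dev = build()
fixed = remove_group(end_dev, "power")
port.remove_recursive()

print(buggy)
print(fixed)
```

The second alternative, dropping the explicit removal, would correspond to
simply not calling remove_group() at all and relying on the recursive
removal of the parent.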