On Fri, Oct 15, 2021 at 10:31:31AM -0700, Luis Chamberlain wrote:
> On Fri, Oct 15, 2021 at 04:36:11PM +0800, Ming Lei wrote:
> > On Thu, Oct 14, 2021 at 05:22:40PM -0700, Luis Chamberlain wrote:
> > > On Fri, Oct 15, 2021 at 07:52:04AM +0800, Ming Lei wrote:
> > ...
> > > >
> > > > We need to understand the exact reason why there is still cpuhp node
> > > > left, can you share us the exact steps for reproducing the issue?
> > > > Otherwise we may have to trace and narrow down the reason.
> > >
> > > See my commit log for my own fix for this issue.
> >
> > OK, thanks!
> >
> > I can reproduce the issue, and the reason is that reset_store fails
> > zram_remove() when unloading module, then the warning is caused.
> >
> > The top 3 patches in the following tree can fix the issue:
> >
> > https://github.com/ming1/linux/commits/my_v5.15-blk-dev
>
> Thanks for trying an alternative fix! A crash stops yes, however this

I doubt it is an alternative, since your patchset doesn't mention the
exact reason for 'Error: Removing state 63 which has instances left.'.
That is simply caused by failing to remove zram because ->claim is set
while the module is being unloaded.

Yes, you mentioned the race between disksize_store() and zram_remove().
However, I don't think it is easily reproduced in the test, because the
race window is pretty small. It can also be fixed easily in my 3rd patch
without any complicated tricks.

I haven't dug into the details of your patchset, which grabs a module
reference count during show/store of kernfs attributes (done in your
patch 9), but IMO this isn't necessary:

1) in its module_exit(), any driver module has to clean up anything
which may refer to symbols or data defined in the module
2) device_del() is often called in module_exit(); once device_del()
returns, no new show/store on the device's kobject attributes is
possible.
3) holding a lock across device_del() while the same lock is also
required in the device's attribute show()/store() is _not_ a must or a
pattern for fixing bugs, and it easily causes an AA deadlock. Your
approach just avoids the issue by not releasing the module until all
show/store calls are done.

Also, the usual model for the module refcount is that anyone who is
going to use the module grabs one extra reference, and releases it once
the use is done. For example, for a block device, the driver's module
refcount is grabbed when the disk/partition is opened, and released
when the disk/partition is closed.

> also ends up leaving the driver in an unrecoverable state after a few
> tries. Ie, you CTRL-C the scripts and try again over and over again and
> the driver ends up in a situation where it just says:
>
> zram: Can't change algorithm for initialized device

It means the algorithm can't be changed for an initialized device at
that exact time. That is understandable, because two zram02.sh
instances are running concurrently.

Your test script just runs two ./zram02.sh tasks concurrently forever,
so what is your expected result for the test? Of course it never
finishes.

I can't reproduce the 'unrecoverable' state in my test. Can you share
the stack trace log after that happens? Is zram02.sh still running, or
is it sleeping somewhere, in the 'unrecoverable' state? If it is still
running, it means the current sleep point isn't interruptible by
'CTRL-C'. In my test, after several 'CTRL-C', both zram02.sh instances
started from two terminals can be terminated. If it is sleeping
somewhere forever, that could be a problem.

>
> And the zram module can't be removed at that point.

That is just because systemd has the zram device open, or the disk is
in use as a swap disk. Once systemd closes it, or after you run
swapoff, the module can be unloaded.

Thanks,
Ming
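
For illustration only, the refcount model described above (grab a
reference on open, release it on close, unload only when no references
remain) can be sketched as a small user-space C model. This is not
kernel code: struct mod_model and the functions mod_get(), mod_put(),
disk_open(), disk_close(), mod_unload() and demo() are hypothetical
names made up for this sketch, and the real kernel helpers behave
differently in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model of a driver module; not kernel code. */
struct mod_model {
	int  refcnt;	/* number of current users, e.g. open disks */
	bool loaded;	/* false once the module has been unloaded */
};

/* Take a reference; fails if the module is already gone. */
static bool mod_get(struct mod_model *m)
{
	if (!m->loaded)
		return false;
	m->refcnt++;
	return true;
}

/* Drop a reference previously taken by mod_get(). */
static void mod_put(struct mod_model *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/* Opening the disk counts as a use, so it pins the module. */
static int disk_open(struct mod_model *m)
{
	return mod_get(m) ? 0 : -ENODEV;
}

/* Closing the disk ends the use and unpins the module. */
static void disk_close(struct mod_model *m)
{
	mod_put(m);
}

/* Unload only succeeds when nobody holds a reference. */
static int mod_unload(struct mod_model *m)
{
	if (m->refcnt > 0)
		return -EBUSY;
	m->loaded = false;
	return 0;
}

/* Walk through open -> failed unload -> close -> unload. */
static int demo(void)
{
	struct mod_model zram = { .refcnt = 0, .loaded = true };

	if (disk_open(&zram) != 0)
		return 1;
	if (mod_unload(&zram) != -EBUSY)	/* still open: busy */
		return 2;
	disk_close(&zram);
	if (mod_unload(&zram) != 0)		/* no users: unload works */
		return 3;
	if (disk_open(&zram) != -ENODEV)	/* gone: open must fail */
		return 4;
	return 0;
}
```

In this model, an unload attempt while the disk is open returns -EBUSY
and succeeds after the last close, which mirrors the point about
systemd or swap keeping the device open.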