On Fri, Dec 20, 2024 at 01:31:06AM +0000, Chen Ridong <chenridong@xxxxxxxxxxxxxxx> wrote: > RIP: 0010:kernfs_should_drain_open_files+0x1a1/0x1b0 I assume it's this WARN_ON_ONCE(atomic_read(&kn->active) != KN_DEACTIVATED_BIAS); > It can be explained by: > rmdir echo 1 > cpuset.cpus > kernfs_fop_write_iter // active=0 > cgroup_rm_file > kernfs_remove_by_name_ns kernfs_get_active // active=1 > __kernfs_remove // active=0x80000002 > kernfs_drain cpuset_write_resmask > wait_event > //waiting (active == 0x80000001) > kernfs_break_active_protection > // active = 0x80000001 > // continue > kernfs_unbreak_active_protection > // active = 0x80000002 > ... > kernfs_should_drain_open_files > // warning occurs > kernfs_put_active Thanks for this breakdown. > To avoid deadlock. the commit 76bb5ab8f6e3 ("cpuset: > break kernfs active protection in cpuset_write_resmask()") added > 'kernfs_break_active_protection' in the cpuset_write_resmask. This could > lead to this warning. The deadlock cycle included cpuset_hotplug_work and since that was removed in the said commit, there shouldn't be same deadlock possible. Ridong, have you run your patch with CONFIG_LOCKDEP to check that eventuality? > After the commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug > processing synchronous"), the cpuset_write_resmask no longer needs to > wait the hotplug to finish, which means that cpuset_write_resmask won't > grab the cgroup_mutex. So the deadlock doesn't exist anymore. Therefore, > remove kernfs_break_active_protection operation in the > 'cpuset_write_resmask' > > Fixes: 76bb5ab8f6e3 ("cpuset: break kernfs active protection in cpuset_write_resmask()") This commit alone isn't sufficient to cause the warning you observed, right? As I read kernfs_break_active_protection() comment, I don't see cpuset code violating its conditions: a) it's broken/unbroken from withing a kernfs file operation handler, b) it pins the needed struct cpuset independently of kernfs_node (it's ok to be removed) All in all -- I think the particular break/unbreak pair is unncecessary nowadays and the warning implemented with hiding/showing kernfs files didn't take temporary breakage into account (only based on quick searching and vague memories). Thanks, Michal
Attachment:
signature.asc
Description: PGP signature