Re: [PATCH v1] cgroup/cpuset: remove kernfs active break

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 12/22/24 9:12 PM, Chen Ridong wrote:

On 2024/12/20 23:13, Waiman Long wrote:
On 12/20/24 1:11 AM, Chen Ridong wrote:
On 2024/12/20 12:16, Waiman Long wrote:
On 12/19/24 11:07 PM, chenridong wrote:
On 2024/12/20 10:55, Waiman Long wrote:
On 12/19/24 8:31 PM, Chen Ridong wrote:
From: Chen Ridong <chenridong@xxxxxxxxxx>

A warning was found:

WARNING: CPU: 10 PID: 3486953 at fs/kernfs/file.c:828
CPU: 10 PID: 3486953 Comm: rmdir Kdump: loaded Tainted: G
RIP: 0010:kernfs_should_drain_open_files+0x1a1/0x1b0
RSP: 0018:ffff8881107ef9e0 EFLAGS: 00010202
RAX: 0000000080000002 RBX: ffff888154738c00 RCX: dffffc0000000000
RDX: 0000000000000007 RSI: 0000000000000004 RDI: ffff888154738c04
RBP: ffff888154738c04 R08: ffffffffaf27fa15 R09: ffffed102a8e7180
R10: ffff888154738c07 R11: 0000000000000000 R12: ffff888154738c08
R13: ffff888750f8c000 R14: ffff888750f8c0e8 R15: ffff888154738ca0
FS:  00007f84cd0be740(0000) GS:ffff8887ddc00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555f9fbe00c8 CR3: 0000000153eec001 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
     kernfs_drain+0x15e/0x2f0
     __kernfs_remove+0x165/0x300
     kernfs_remove_by_name_ns+0x7b/0xc0
     cgroup_rm_file+0x154/0x1c0
     cgroup_addrm_files+0x1c2/0x1f0
     css_clear_dir+0x77/0x110
     kill_css+0x4c/0x1b0
     cgroup_destroy_locked+0x194/0x380
     cgroup_rmdir+0x2a/0x140
Were you using cgroup v1 or v2 when this warning happened?
I was using cgroup v1.
Thanks for the confirmation.
It can be explained by:
rmdir                 echo 1 > cpuset.cpus
                   kernfs_fop_write_iter // active=0
cgroup_rm_file
kernfs_remove_by_name_ns    kernfs_get_active // active=1
__kernfs_remove                      // active=0x80000002
kernfs_drain            cpuset_write_resmask
wait_event
//waiting (active == 0x80000001)
                   kernfs_break_active_protection
                   // active = 0x80000001
// continue
                   kernfs_unbreak_active_protection
                   // active = 0x80000002
...
kernfs_should_drain_open_files
// warning occurs
                   kernfs_put_active

This warning is caused by 'kernfs_break_active_protection' when it is
writing to cpuset.cpus, and the cgroup is removed concurrently.

The commit 3a5a6d0c2b03 ("cpuset: don't nest cgroup_mutex inside
get_online_cpus()") made cpuset_hotplug_workfn asynchronous, which
grabs
the cgroup_mutex. To avoid deadlock. the commit 76bb5ab8f6e3
("cpuset:
break kernfs active protection in cpuset_write_resmask()") added
'kernfs_break_active_protection' in the cpuset_write_resmask. This
could
lead to this warning.

After the commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug
processing synchronous"), the cpuset_write_resmask no longer needs to
wait the hotplug to finish, which means that cpuset_write_resmask
won't
grab the cgroup_mutex. So the deadlock doesn't exist anymore.
Therefore,
remove kernfs_break_active_protection operation in the
'cpuset_write_resmask'
The hotplug operation itself is now being done synchronously, but task
transfer (cgroup_transfer_tasks()) because of lacking online CPUs is
still being done asynchronously. So kernfs_break_active_protection()
will still be needed for cgroup v1.

Cheers,
Longman


Thank you, Longman.
IIUC, The commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug
processing synchronous") deleted the 'flush_work(&cpuset_hotplug_work)'
in the cpuset_write_resmask. And I do not see any process within the
cpuset_write_resmask that will grab cgroup_mutex, except for
'flush_work(&cpuset_hotplug_work)'.

Although cgroup_transfer_tasks() is asynchronous, the
cpuset_write_resmask will not wait any work that will grab
cgroup_mutex.
Consequently, the deadlock does not exist anymore.

Did I miss something?
Right. The flush_work() call is still needed for a different work
function. cpuset_write_resmask() will not need to grab cgroup_mutex, but
the asynchronously executed cgroup_transfer_tasks() will. I will work on
a patch to fix that issue.

Cheers,
Longman
If flush_work() is added back, this warning still exists. Do you have a
idea to fix this warning?
I was wrong. The flush_work() call isn't needed in this case and we
shouldn't need to break kernfs protection. However, your patch
description isn't quite right.

After the commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug
processing synchronous"), the cpuset_write_resmask no longer needs to
wait the hotplug to finish, which means that cpuset_write_resmask won't
grab the cgroup_mutex. So the deadlock doesn't exist anymore.
cpuset_write_resmask() never needs to grab the cgroup_mutex. The act of
calling flush_work() can create a multiple processes circular locking
dependency that involve cgroup_mutex which can cause a deadlock. After
making cpuset hotplug synchronous, concurrent hotplug and cpuset
operations are no longer possible. However, concurrent task transfer out
of a previously empty CPU cpuset and adding CPU back to that cpuset is
possible. This will result in what the comment said "keep removing tasks
added
after execution capability is restored". That should be rare though and
we should probably add a check in cgroup_transfer_tasks() to detect such
a case and break out of it.

Cheers,
Longman
Hi, Longman, sorry the confused message. Do you mean this patch is
acceptable if I update the message?
Sorry for the late reply. Yes, the patch is acceptable, but the patch description isn't quite right. Please sent out a v2.

I don't think we need to add a check in the cgroup_transfer_tasks
function. Because no process(except for writing cpuset.cpus, which has
been reoved) will need 'kn->active' to involve cgroup_transfer_tasks now.

I agree that we don't need to add a check in cgroup_transfer_tasks().

Cheers,
Longman






[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux