[RFC PATCH bpf-next v2 2/9] cgroup: Eliminate the need for cgroup_mutex in proc_cgroup_show()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The cgroup root_list is already RCU-safe. Therefore, we can replace the
cgroup_mutex with the RCU read lock in some particular paths. This change
will be particularly beneficial for frequent operations, such as
`cat /proc/self/cgroup`, in a cgroup1-based container environment.

I did stress tests with this change, as outlined below
(with CONFIG_PROVE_RCU_LIST enabled):

- Continuously mounting and unmounting named cgroups in some tasks,
  for example:

  cgrp_name=$1
  while true
  do
      mount -t cgroup -o none,name=$cgrp_name none /$cgrp_name
      umount /$cgrp_name
  done

- Continuously triggering proc_cgroup_show() in some tasks concurrently,
  for example:
  while true; do cat /proc/self/cgroup > /dev/null; done

They can ran successfully after implementing this change, with no RCU
warnings in dmesg. It's worth noting that this change can also catch
deleted cgroups, as demonstrated by running the following task at the
same time:

  while true; do grep deleted /proc/self/cgroup; done

Results in output like:

  7995:name=cgrp2: (deleted)
  8594:name=cgrp1: (deleted)

Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
---
 kernel/cgroup/cgroup.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index bae8f9f27792..30bdb3bf1dcd 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6256,7 +6256,7 @@ int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
 	if (!buf)
 		goto out;
 
-	cgroup_lock();
+	rcu_read_lock();
 	spin_lock_irq(&css_set_lock);
 
 	for_each_root(root) {
@@ -6279,6 +6279,10 @@ int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
 		seq_putc(m, ':');
 
 		cgrp = task_cgroup_from_root(tsk, root);
+		if (!cgrp) {
+			seq_puts(m, " (deleted)\n");
+			continue;
+		}
 
 		/*
 		 * On traditional hierarchies, all zombie tasks show up as
@@ -6311,7 +6315,7 @@ int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
 	retval = 0;
 out_unlock:
 	spin_unlock_irq(&css_set_lock);
-	cgroup_unlock();
+	rcu_read_unlock();
 	kfree(buf);
 out:
 	return retval;
-- 
2.30.1 (Apple Git-130)





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux