Re: [PATCH RFC] proc: Fix a dentry lock race between release_task and lookup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,
If I understand correctly you are saying that under some circumstances
this code runs slow, and you are proposing an optimization.


That optimization is to change the content of the pid->inodes list
from all directories under that pid, to just the /proc/<tgid> and
/proc/<tgid>/task/<pid>.


Yes and yes.

The justification being that d_invalidate on the parent directory will
invalidate all children.  So only those two directories are interesting
from a d_invalidate point of view.

Absolutely right.
That seems like a valid optimization.

This could also count as a regression fix if you can show how the
performance changed poorly when the pid->inodes change was introduced
and how the performance improves with your change.   I currently only
see that you hit a pathological case and you are correcting it.

There is a reproducer attached in commit msg:
https://bugzilla.kernel.org/show_bug.cgi?id=216054

Create 200 tasks, each task open one file for 50,000 times. Kill all tasks when opened files exceed 10,000,000 (cat /proc/sys/fs/file-nr).
Before fix:
$ time killall -wq aa
real 4m40.946s # During this period, we can see 'ps' and 'systemd' taking high cpu usage.

After fix:
$ time killall -wq aa
real 1min~ # During this period, we can see 'systemd' taking high cpu usage.


As for the actual code change I think it would be better to
remove the code from proc_pid_make_inode and make a helper
proc_pid_make_base_inode that performs the extra work of
adding to the pid->list.  Not adding a flag makes the code
easier to follow.

Agree, will send a v2.
Something like the code below.

diff --git a/fs/proc/base.c b/fs/proc/base.c
index d654ce7150fd..9d025e70ddc3 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1915,11 +1915,6 @@ struct inode *proc_pid_make_inode(struct super_block * sb,
/* Let the pid remember us for quick removal */
  	ei->pid = pid;
-	if (S_ISDIR(mode)) {
-		spin_lock(&pid->lock);
-		hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
-		spin_unlock(&pid->lock);
-	}
task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid);
  	security_task_to_inode(task, inode);
@@ -1932,6 +1927,27 @@ struct inode *proc_pid_make_inode(struct super_block * sb,
  	return NULL;
  }
+struct inode *proc_pid_make_base_inode(struct super_block * sb,
+				       struct task_struct *task, umode_t mode)
+{
+	struct inode * inode;
+	struct proc_inode *ei;
+	struct pid *pid;
+
+	inode = proc_pid_make_inode(sb, task, mode);
+	if (!inode)
+		return NULL;
+
+	/* Let proc_flush_pid find this directory inode */
+	ei = PROC_I(inode);
+	pid = ei->pid;
+	spin_lock(&pid->lock);
+	hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
+	spin_unlock(&pid->lock);
+
+	return inode;
+}
+
  int pid_getattr(struct user_namespace *mnt_userns, const struct path *path,
  		struct kstat *stat, u32 request_mask, unsigned int query_flags)
  {
@@ -3351,7 +3367,7 @@ static struct dentry *proc_pid_instantiate(struct dentry * dentry,
  {
  	struct inode *inode;
- inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
+	inode = proc_pid_make_base_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
  	if (!inode)
  		return ERR_PTR(-ENOENT);
@@ -3650,7 +3666,7 @@ static struct dentry *proc_task_instantiate(struct dentry *dentry,
  	struct task_struct *task, const void *ptr)
  {
  	struct inode *inode;
-	inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
+	inode = proc_pid_make_base_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
  	if (!inode)
  		return ERR_PTR(-ENOENT);
Eric
.





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux