ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes: > Stephen Brennan <stephen.s.brennan@xxxxxxxxxx> writes: > >> The pid_revalidate() function requires dropping from RCU into REF lookup >> mode. When many threads are resolving paths within /proc in parallel, >> this can result in heavy spinlock contention as each thread tries to >> grab a reference to the /proc dentry (and drop it shortly thereafter). >> >> Allow the pid_revalidate() function to execute under LOOKUP_RCU. When >> updates must be made to the inode due to the owning task performing >> setuid(), drop out of RCU and into REF mode. > > So rather than get_task_rcu_user. I think what we want is a function > that verifies task->rcu_users > 0. > > Which frankly is just "pid_task(proc_pid(inode), PIDTYPE_PID)". > > Which is something that we can do unconditionally in pid_revalidate. > > Skipping the update of the inode is probably the only thing that needs > to be skipped. > > It looks like the code can safely rely on the the security_task_to_inode > in proc_pid_make_inode and remove the security_task_to_inode in > pid_update_inode. > This makes sense, I'll get rid of the get_task_rcu_user() stuff in a v2. > >> Signed-off-by: Stephen Brennan <stephen.s.brennan@xxxxxxxxxx> >> --- >> >> I'd like to use this patch as an RFC on this approach for reducing spinlock >> contention during many parallel path lookups in the /proc filesystem. The >> contention can be triggered by, for example, running ~100 parallel instances of >> "TZ=/etc/localtime ps -fe >/dev/null" on a 100CPU machine. The %sys utilization >> in such a case reaches around 90%, and profiles show two code paths with high >> utilization: > > Do you have a real world work-load that is behaves something like this > micro benchmark? I am just curious how severe the problem you are > trying to solve is. > We have seen this issue occur internally with monitoring scripts (perhaps a bit misconfigured, I'll admit). However I don't have an exact sample workload that I can give you. Thanks, Stephen