On 08/03/2016 04:46 PM, Jeff Layton wrote:
> On Wed, 2016-08-03 at 10:35 +0300, Nikolay Borisov wrote:
>> On busy container servers reading /proc/locks shows all the locks
>> created by all clients. This can cause large latency spikes. In my
>> case I observed lsof taking up to 5-10 seconds while processing around
>> 50k locks. Fix this by limiting the locks shown only to those created
>> in the same pidns as the one the proc was mounted in. When reading
>> /proc/locks from the init_pid_ns show everything.
>>
>> Signed-off-by: Nikolay Borisov <kernel@xxxxxxxx>
>> ---
>>  fs/locks.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/fs/locks.c b/fs/locks.c
>> index ee1b15f6fc13..751673d7f7fc 100644
>> --- a/fs/locks.c
>> +++ b/fs/locks.c
>> @@ -2648,9 +2648,15 @@ static int locks_show(struct seq_file *f, void *v)
>>  {
>>  	struct locks_iterator *iter = f->private;
>>  	struct file_lock *fl, *bfl;
>> +	struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;
>> +	struct pid_namespace *current_pidns = task_active_pid_ns(current);
>>
>>  	fl = hlist_entry(v, struct file_lock, fl_link);
>>
>> +	if ((current_pidns != &init_pid_ns) && fl->fl_nspid
>
> Ok, so when you read from a process that's in the init_pid_ns
> namespace, then you'll get the whole pile of locks, even when reading
> this from a filesystem that was mounted in a different pid_ns?
>
> That seems odd to me if so. Any reason not to just uniformly use the
> proc_pidns here?

[CCing some people from openvz/CRIU]

My train of thought was "we should have a means that serves as the one
universal truth about everything, and that would be a process in the
init_pid_ns". I don't have a strong preference as long as I'm not
breaking userspace. As I said before, I think the CRIU guys might be
using that interface.

>
>> +		&& (proc_pidns != ns_of_pid(fl->fl_nspid)))
>> +		return 0;
>> +
>>  	lock_get_status(f, fl, iter->li_pos, "");
>>
>>  	list_for_each_entry(bfl, &fl->fl_block, fl_block)
>
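
To make the difference between the two filters concrete, here is a minimal
userspace sketch of the predicate in the hunk above next to one literal
reading of the "uniformly use proc_pidns" suggestion. Everything in it
(struct pidns, struct lock_model, show_as_posted(), show_uniform()) is a
hypothetical stand-in for illustration, not kernel code. owner_ns models
ns_of_pid(fl->fl_nspid), and a NULL owner_ns models fl->fl_nspid being NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel types. */
struct pidns {
	int level;	/* only here to give each namespace a distinct identity */
};

struct lock_model {
	/* Models ns_of_pid(fl->fl_nspid); NULL models fl->fl_nspid == NULL. */
	const struct pidns *owner_ns;
};

/*
 * Filter as posted in the patch: a reader in the init pid namespace is
 * shown every lock; any other reader is shown only locks whose owner
 * lives in the namespace this /proc instance was mounted in.
 */
static bool show_as_posted(const struct pidns *init_ns,
			   const struct pidns *reader_ns,
			   const struct pidns *proc_ns,
			   const struct lock_model *fl)
{
	if (reader_ns != init_ns && fl->owner_ns != NULL &&
	    proc_ns != fl->owner_ns)
		return false;
	return true;
}

/*
 * Assumed literal reading of the review suggestion: drop the reader's
 * namespace from the test and filter by the /proc mount's namespace only.
 */
static bool show_uniform(const struct pidns *proc_ns,
			 const struct lock_model *fl)
{
	if (fl->owner_ns != NULL && proc_ns != fl->owner_ns)
		return false;
	return true;
}
```

In this model the two predicates only disagree for a reader in the init pid
namespace: the posted version shows it every lock regardless of which /proc
mount it reads, while the uniform reading filters by the mount's namespace
alone, so a /proc mounted in the init namespace would list only locks whose
owner pid was allocated there (plus locks with no owner pid).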