Re: [PATCH v2] locks: Filter /proc/locks output on proc pid ns

On 08/03/2016 05:28 PM, J. Bruce Fields wrote:
> On Wed, Aug 03, 2016 at 05:17:09PM +0300, Nikolay Borisov wrote:
>>
>>
>> On 08/03/2016 04:46 PM, Jeff Layton wrote:
>>> On Wed, 2016-08-03 at 10:35 +0300, Nikolay Borisov wrote:
>>>> On busy container servers reading /proc/locks shows all the locks
>>>> created by all clients. This can cause large latency spikes. In my
>>>> case I observed lsof taking up to 5-10 seconds while processing around
>>>> 50k locks. Fix this by limiting the locks shown only to those created
>>>> in the same pidns as the one the proc was mounted in. When reading
>>>> /proc/locks from the init_pid_ns show everything.
>>>>
>>>> Signed-off-by: Nikolay Borisov <kernel@xxxxxxxx>
>>>> ---
>>>>  fs/locks.c | 6 ++++++
>>>>  1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/fs/locks.c b/fs/locks.c
>>>> index ee1b15f6fc13..751673d7f7fc 100644
>>>> --- a/fs/locks.c
>>>> +++ b/fs/locks.c
>>>> @@ -2648,9 +2648,15 @@ static int locks_show(struct seq_file *f, void *v)
>>>>  {
>>>>  	struct locks_iterator *iter = f->private;
>>>>  	struct file_lock *fl, *bfl;
>>>> +	struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;
>>>> +	struct pid_namespace *current_pidns = task_active_pid_ns(current);
>>>>  
>>>>  	fl = hlist_entry(v, struct file_lock, fl_link);
>>>>  
>>>> +	if ((current_pidns != &init_pid_ns) && fl->fl_nspid
>>>
>>> Ok, so when you read from a process that's in the init_pid_ns
>>> namespace, then you'll get the whole pile of locks, even when reading
>>> this from a filesystem that was mounted in a different pid_ns?
>>>
>>> That seems odd to me if so. Any reason not to just uniformly use the
>>> proc_pidns here?
>>
>> [CCing some people from openvz/CRIU]
>>
>> My train of thought was "we should have means which would be the one
>> universal truth about everything and this would be a process in the
>> init_pid_ns".
> 
> OK, but why not make that means be "mount proc from the init_pid_ns and
> read /proc/locks there".  So just replace current_pidns with proc_pidns
> in the above.  I think that's all Jeff was suggesting.

Oh, you are right. Silly me, yes, I'm happy with this and I will send a
patch.
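
For reference, here is a rough userspace sketch of the check with
current_pidns replaced by proc_pidns, as suggested above. It is not the
follow-up patch. The structs are simplified stand-ins for the kernel
types, and lock_visible() and check_examples() are hypothetical helpers
that exist only for this illustration (locks_show() does the test inline).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; the real kernel structs are far larger. */
struct pid_namespace { int level; };
struct pid { struct pid_namespace *ns; };
struct file_lock { struct pid *fl_nspid; };

/* Stand-in for the kernel's initial pid namespace. */
struct pid_namespace init_pid_ns = { 0 };

/* Modelled on the kernel helper of the same name: the namespace a pid belongs to. */
struct pid_namespace *ns_of_pid(const struct pid *pid)
{
	return pid ? pid->ns : NULL;
}

/*
 * Hypothetical helper: should this lock be listed when /proc/locks is read
 * through a proc instance mounted in proc_pidns? Only the namespace of the
 * proc mount matters here, not the namespace of the reading task.
 */
bool lock_visible(const struct pid_namespace *proc_pidns,
		  const struct file_lock *fl)
{
	if (proc_pidns != &init_pid_ns && fl->fl_nspid &&
	    proc_pidns != ns_of_pid(fl->fl_nspid))
		return false;
	return true;
}

/* A few cases exercising the rule above. */
void check_examples(void)
{
	struct pid_namespace container_ns = { 1 };
	struct pid host_pid = { &init_pid_ns };
	struct pid container_pid = { &container_ns };
	struct file_lock host_lock = { &host_pid };
	struct file_lock container_lock = { &container_pid };
	struct file_lock ownerless_lock = { NULL };

	/* proc mounted in init_pid_ns: everything is shown. */
	assert(lock_visible(&init_pid_ns, &host_lock));
	assert(lock_visible(&init_pid_ns, &container_lock));

	/* proc mounted in a container: only that namespace's locks. */
	assert(lock_visible(&container_ns, &container_lock));
	assert(!lock_visible(&container_ns, &host_lock));

	/* Locks without fl_nspid are never filtered, as in the patch. */
	assert(lock_visible(&container_ns, &ownerless_lock));
}
```

With this form, "the one universal truth" is still available: mount proc
from the init_pid_ns and read /proc/locks there.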


> 
> --b.
> 
>> I don't have strong preference as long as I'm not breaking
>> userspace. As I said before - I think the CRIU guys might be using that
>> interface.
>>
>>>
>>>> +	    && (proc_pidns != ns_of_pid(fl->fl_nspid)))
>>>> +		return 0;
>>>> +
>>>>  	lock_get_status(f, fl, iter->li_pos, "");
>>>>  
>>>>  	list_for_each_entry(bfl, &fl->fl_block, fl_block)
>>>
