[PATCHSET cgroup-for-3.14] cgroup: restructure pidlist handling

Hello,

pidlist handling is quite elaborate.  The pidlist files - "tasks" and
"cgroup.procs" - guarantee that the result is sorted, but a task can
be associated with different pids depending on namespaces, with no
inherent order among them.  It is therefore impossible to give a
certain order to tasks of a cgroup and then just iterate through them.

Instead, we end up creating tables of the relevant ids and then sort
them before serving them out for reads.  As those tables can be huge,
we also implement logic to share those tables if the id type and
namespace match, which in turn involves reference counting those
tables and synchronizing accesses to them.
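
As a rough userspace illustration of the "build a table, then sort it"
step described above (not the kernel code itself), the sketch below
sorts an array of ids and drops duplicates, which can arise when
several threads map to the same process id.  The names cmp_pid() and
pidlist_sort_uniq() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort() callback: ascending numeric order. */
static int cmp_pid(const void *a, const void *b)
{
	int x = *(const int *)a;
	int y = *(const int *)b;

	return (x > y) - (x < y);
}

/*
 * Sort @ids in place and drop duplicates.  Returns the number of
 * unique entries left at the front of the array.  Hypothetical helper
 * for illustration; not a kernel function.
 */
static size_t pidlist_sort_uniq(int *ids, size_t n)
{
	size_t src, dst;

	assert(ids != NULL || n == 0);
	if (n == 0)
		return 0;

	qsort(ids, n, sizeof(ids[0]), cmp_pid);

	/* After sorting, equal ids are adjacent; keep the first of each run. */
	for (src = 1, dst = 1; src < n; src++)
		if (ids[src] != ids[dst - 1])
			ids[dst++] = ids[src];

	return dst;
}
```

The whole table has to exist before the first id can be served, which
is why the tables get large and why sharing them looked attractive.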

What could have been a simple iteration through the member tasks
became this unnecessary hunk of complexity because it, for some
reason, wanted to guarantee sorted output, which is extremely unusual
for this type of interface.

The refcnting is done from open() and release() callbacks, which
kernfs doesn't expose.  This patchset updates pidlist handling so that
pidlists are managed from seq_file operations proper.  As the duration
between the paired start and stop denotes a single read invocation and
we don't want to reload pidlist for each instance of consecutive read
calls, pidlist is released with time delay.  This also bounds how
stale the output of read calls can be.  This makes refcnting
unnecessary - locking is simplified and refcnting is dropped.
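
The delayed-release idea can be modelled in userspace roughly as
follows.  This is only a single-threaded sketch under stated
assumptions: time is passed in explicitly, RELEASE_DELAY and all
struct and function names are made up for the example, and the kernel
side would use its own timer or work item machinery and locking
instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RELEASE_DELAY 1	/* hypothetical delay, arbitrary time units */

/* Hypothetical model of a cached pidlist; not the kernel struct. */
struct pidlist_cache {
	bool loaded;		/* is a table currently cached? */
	bool busy;		/* between start() and stop()? */
	long expires;		/* earliest time the table may be dropped */
	unsigned int loads;	/* how many times the table was (re)built */
};

/* Models seq_file start(): reuse the cached table or build a new one. */
static void cache_start(struct pidlist_cache *c)
{
	assert(c != NULL);
	if (!c->loaded) {
		c->loaded = true;
		c->loads++;
	}
	c->busy = true;
}

/* Models seq_file stop(): nothing is freed, the expiry is pushed out. */
static void cache_stop(struct pidlist_cache *c, long now)
{
	c->busy = false;
	c->expires = now + RELEASE_DELAY;
}

/* Models the delayed release firing at time @now. */
static void cache_tick(struct pidlist_cache *c, long now)
{
	if (c->loaded && !c->busy && now >= c->expires)
		c->loaded = false;
}
```

Back-to-back reads that arrive before the expiry hit the cached table,
while a read after the expiry rebuilds it, so no reader ever sees data
much older than the delay.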

In the long term, we want to do away with pidlist and make this a
simple iteration over member tasks.  The last patch scrambles the sort
order of "cgroup.procs" if sane_behavior, so that the sorted
expectation is broken in the new interface and we can eventually drop
pidlist logic.
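
One way such scrambling could work, shown purely as an illustration
and not necessarily what the patch does, is to sort by a permuted key
instead of by the id itself.  The hypothetical scramble_key() below
swaps each pair of adjacent bits, so the output order is deterministic
but no longer numeric.  It assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Swap each pair of adjacent bits.  Applying it twice gives back the
 * input, so the mapping is a permutation and no two ids collide.
 */
static unsigned int scramble_key(unsigned int pid)
{
	unsigned int low = pid & 0x55555555u;	/* bits 0, 2, 4, ... */
	unsigned int high = pid & 0xAAAAAAAAu;	/* bits 1, 3, 5, ... */

	return (low << 1) | (high >> 1);
}

/* qsort() callback: ascending order of the scrambled key. */
static int cmp_scrambled(const void *a, const void *b)
{
	unsigned int x = scramble_key(*(const unsigned int *)a);
	unsigned int y = scramble_key(*(const unsigned int *)b);

	return (x > y) - (x < y);
}

/* Sort @pids in place by scrambled key rather than by value. */
static void sort_scrambled(unsigned int *pids, size_t n)
{
	assert(pids != NULL || n == 0);
	if (n > 0)
		qsort(pids, n, sizeof(pids[0]), cmp_scrambled);
}
```

Because the key is a permutation, duplicates still end up adjacent
after sorting, so any uniquifying pass keeps working unchanged.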

This patchset contains the following nine patches.

 0001-cgroup-don-t-skip-seq_open-on-write-only-opens-on-pi.patch
 0002-cgroup-remove-cftype-release.patch
 0003-cgroup-implement-delayed-destruction-for-cgroup_pidl.patch
 0004-cgroup-introduce-struct-cgroup_pidlist_open_file.patch
 0005-cgroup-refactor-cgroup_pidlist_find.patch
 0006-cgroup-remove-cgroup_pidlist-rwsem.patch
 0007-cgroup-load-and-release-pidlists-from-seq_file-start.patch
 0008-cgroup-remove-cgroup_pidlist-use_count.patch
 0009-cgroup-don-t-guarantee-cgroup.procs-is-sorted-if-san.patch

0001-0002 are prep patches.

0003-0008 restructure pidlist handling so that it's managed from
seq_file operations.

0009 scrambles sort order of cgroup.procs if sane_behavior.

This patchset is on top of cgroup/for-3.14 edab95103d3a ("cgroup:
Merge branch 'memcg_event' into for-3.14") and available in the
following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-pidlist

diffstat follows.

 include/linux/cgroup.h |    5 
 kernel/cgroup.c        |  310 +++++++++++++++++++++++++++++++------------------
 2 files changed, 204 insertions(+), 111 deletions(-)

Thanks.

--
tejun