On Tue, Dec 13, 2016 at 8:39 AM, Casey Schaufler <casey@xxxxxxxxxxxxxxxx> wrote: > On 12/13/2016 1:47 AM, Michael Kerrisk (man-pages) wrote: >> Hi John, >> >> On 13 December 2016 at 02:39, John Stultz <john.stultz@xxxxxxxxxx> wrote: >>> This patch adds CAP_GROUP_MIGRATE and logic to allows a process >> s/CAP_GROUP_MIGRATE/CAP_CGROUP_MIGRATE/ >> >>> to migrate other tasks between cgroups. >>> >>> In Android (where this feature originated), the ActivityManager >>> tracks various application states (TOP_APP, FOREGROUND, >>> BACKGROUND, SYSTEM, etc), and then as applications change >>> states, the SchedPolicy logic will migrate the application tasks >>> between different cgroups used to control the different >>> application states (for example, there is a background cpuset >>> cgroup which can limit background tasks to stay on one low-power >>> cpu, and the bg_non_interactive cpuctrl cgroup can then further >>> limit those background tasks to a small percentage of that one >>> cpu's cpu time). >>> >>> However, for security reasons, Android doesn't want to make the >>> system_server (the process that runs the ActivityManager and >>> SchedPolicy logic), run as root. So in the Android common.git >>> kernel, they have some logic to allow cgroups to loosen their >>> permissions so CAP_SYS_NICE tasks can migrate other tasks between >>> cgroups. >>> >>> I feel the approach taken there overloads CAP_SYS_NICE a bit much >>> for non-android environments. Efforts to re-use CAP_SYS_RESOURCE >>> for this purpose (which Android has since adopted) was also >>> stymied by concerns about risks from future cgroups that could be >>> considered "dangerous" by how they might change system semantics. >>> >>> So to avoid overlapping usage, this patch adds a brand new >>> process capability flag (CAP_CGROUP_MIGRATE), and uses it when >>> checking if a task can migrate other tasks between cgroups. >>> >>> I've tested this with AOSP master (though its a bit hacked in as >>> I still need to properly get the selinux bits aware of the new >>> capability bit) with selinux set to permissive and it seems to be >>> working well. >>> >>> Thoughts and feedback would be appreciated! >> So, back to the discussion of silos. I understand the argument for >> wanting a new silo. But, in that case can we at least try not to make >> it a single-use silo? >> >> How about CAP_CGROUP_CONTROL or some such, with the idea that this >> might be a capability that allows the holder to step outside usual >> cgroup rules? At the moment, that capability would allow only one such >> step, but maybe there would be others in the future. > > I agree, but want to put it more strongly. The granularity of > capabilities can never be fine enough for some people, and this > is an example of a case where you're going a bit too far. If the > use case is Android as you say, you don't need this. As my friends > on the far side of the aisle would say, "just write SELinux policy" > to correctly control access as required. So.. The trouble is that while selinux is good for restricting permissions, the in-kernel permission checks here are already too restrictive. It seems one must first loosen things up before we can tighten it with selinux rules. Or are you suggesting the system_server run as root + further selinux limitations? I worry, the Android developers may still be hesitant to do that. thanks -john -- To unsubscribe from this list: send the line "unsubscribe cgroups" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html