On 05/12/2015 08:52 AM, Ingo Molnar wrote:
> What I suggested is that it might make sense to offer a system call, for example a sched_setparam() variant, that makes such guarantees. Say if user-space does:
>
>     ret = sched_setscheduler(0, BIND_ISOLATED, &isolation_params);
>
> ... then we would get the task moved to an isolated domain and get a 0 return code if the kernel is able to do all that and if the current uid/namespace/etc. has the required permissions and such.
Unfortunately I don't know nearly as much about the scheduler and scheduler policies as I might, since I mostly focused on making the scheduler stay out of the way. :-)

This does seem like another way to set a policy bit on a process. I assume you could only validly issue this call on a nohz_full core, and that you're not assuming it migrates the task to such a core?

You suggested that BIND_ISOLATED would not replace the usual scheduler policies, but perhaps SCHED_ISOLATED as a full replacement would make sense: it would make it an error to have any other schedulable task on that core.

I guess that brings it around to whether the "cpu_isolated" task just loses when another task is scheduled on the core with it (the current approach I'm proposing), or whether it ends up truly owning the core and other processes can be denied the right to run there. The latter clearly gets us into the area of requiring privileges to set up, as Andy pointed out later. This would leave the notion of "strict" as proposed elsewhere as a separate thing, but presumably it could still be a prctl() as originally proposed.

I admit I don't know enough to say whether this sounds like a better approach than just using a prctl() to set the cpu_isolated state. My instinct is that it's cleanest to avoid requiring permissions to do this, and to simply enable the quiescing semantics the process requested when it happens to be alone on a core. If so, it's somewhat orthogonal to the actual scheduler policy in force, so best not to conflate it with the notion of scheduler code at all via sched_setscheduler()?

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
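For concreteness, here is a rough user-space sketch of the unprivileged prctl() flavor discussed above: the task pins itself to a core with the existing sched_setaffinity() call, then asks for the quiescing semantics. The option name PR_SET_CPU_ISOLATED_HYPOTHETICAL and its value are placeholders made up for illustration, not an existing kernel ABI, and the helper names are likewise assumptions. On a kernel without such an option the request is expected to fail, typically with EINVAL.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for CPU_SET() and sched_setaffinity() */
#endif
#include <errno.h>
#include <sched.h>
#include <sys/prctl.h>

/*
 * HYPOTHETICAL option number, chosen only for illustration.
 * It is an assumption of this sketch, not a real prctl() option.
 */
#define PR_SET_CPU_ISOLATED_HYPOTHETICAL 0x7fff0001

/* Return the lowest-numbered CPU in the caller's affinity mask, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Restrict the calling thread to a single CPU; 0 on success, -1 on error. */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/*
 * Ask for the "cpu_isolated" state without any privileges.
 * Returns 0 if the kernel accepted the request, -1 with errno set otherwise
 * (which is what a kernel lacking the hypothetical option will do).
 */
int request_cpu_isolation(void)
{
    return prctl(PR_SET_CPU_ISOLATED_HYPOTHETICAL, 1UL, 0UL, 0UL, 0UL);
}
```

A caller would pick a core (ideally a nohz_full one), call pin_to_cpu() on it, and then call request_cpu_isolation(), treating a failure as "run without isolation" rather than as a fatal error. Nothing here touches the scheduling policy, which is the point of keeping it separate from sched_setscheduler().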