On Fri, Sep 16, 2016 at 9:50 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, Sep 16, 2016 at 09:29:06AM -0700, Andy Lutomirski wrote:
>
>> > SCHED_DEADLINE, its a 'Global'-EDF like scheduler that doesn't support
>> > CPU affinities (because that doesn't make sense). The only way to
>> > restrict it is to partition.
>> >
>> > 'Global' because you can partition it. If you reduce your system to
>> > single CPU partitions you'll reduce to P-EDF.
>> >
>> > (The same is true of SCHED_FIFO, that's a 'Global'-FIFO on the same
>> > partition scheme, it however does support sched_affinity, but using it
>> > gives 'interesting' schedulability results -- call it a historic
>> > accident).
>>
>> Hmm, I didn't realize that the deadline scheduler was global. But
>> ISTM requiring the use of "exclusive" to get this working is
>> unfortunate. What if a user wants two separate partitions, one using
>> CPUs 1 and 2 and the other using CPUs 3 and 4 (with 5 reserved for
>> non-RT stuff)?
>
> {1,2} {3,4} {5} seem exclusive, did I miss something? (other than that 5
> cpu parts are 'rare').

There's no overlap, so they're logically exclusive, but it avoids
needing the "cpu_exclusive" parameter. It always seemed confusing to
me that a setting on a child cgroup would strictly remove a resource
from the parent. (To be clear: I don't have any particularly strong
objection to cpu_exclusive. It just always seemed like a bit of a
hack that mostly duplicated what you could get by just setting the
cpusets appropriately throughout the hierarchy.)

>> > Note that related, but differently, we have the isolcpus boot parameter
>> > which creates single CPU partitions for all listed CPUs and gives the
>> > rest to the root cpuset. Ideally we'd kill this option given its a boot
>> > time setting (for something which is trivially to do at runtime).
>> >
>> > But this cannot be done, because that would mean we'd have to start with
>> > a !0 cpuset layout:
>> >
>> >                 '/'
>> >                 load_balance=0
>> >             /              \
>> >     'system'            'isolated'
>> >     cpus=~isolcpus      cpus=isolcpus
>> >                         load_balance=0
>> >
>> > And start with _everything_ in the /system group (including default IRQ
>> > affinities).
>> >
>> > Of course, that will break everything cgroup :-(
>> >
>>
>> I would actually *much* prefer this over the status quo. I'm tired of
>> my crappy, partially-working script that sits there and creates
>> exactly this configuration (minus the isolcpus part because I actually
>> want migration to work) on boot. (Actually, it could have two
>> automatic cgroups: /kernel and /init -- init and UMH would go in init
>> and kernel threads and such would go in /kernel. Userspace would be
>> able to request that a different cgroup be used for newly-created
>> kernel threads.)
>
> So there's a problem with sticking kernel threads (and esp. kthreadd)
> into !root groups. For example if you place it in a cpuset that doesn't
> have all cpus, then binding your shiny new kthread to a cpu will fail.
>
> You can fix that of course, and we used to do exactly that, but we kept
> running into 'fun' cases like that.

Blech. But maybe this *should* have that effect. I'm sick of random
kernel crap being scheduled on my RT CPUs and on the CPUs that I
intend to be kept forcibly idle.

>
> The unbound workqueue stuff is totally arbitrary borkage though, that
> can be made to work just fine, TJ didn't like it for some reason which I
> really cannot remember.
>
> Also, UMH?

User mode helper. Fortunately most users are gone now, but it still exists.
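
As an illustration of the "no overlap, so they're logically exclusive" point about {1,2} {3,4} {5}, the check can be done mechanically in userspace. This is only a minimal sketch in Python, assuming partitions are written as cpulist-style strings such as "1-2" or "5". The names parse_cpulist, overlapping_partitions, and the partition labels are hypothetical and are not part of any kernel or cgroup interface.

```python
from itertools import combinations


def parse_cpulist(text):
    """Parse a cpulist-style string such as "1-2,5" into a set of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def overlapping_partitions(partitions):
    """Return the pairs of partition names whose CPU sets intersect."""
    sets = {name: parse_cpulist(cpus) for name, cpus in partitions.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(sets), 2)
        if sets[a] & sets[b]
    ]


# Hypothetical labels for the layout discussed in the thread:
# two RT partitions plus one CPU reserved for non-RT work.
layout = {"rt_a": "1-2", "rt_b": "3-4", "other": "5"}
conflicts = overlapping_partitions(layout)  # empty list: no CPU is shared
```

An empty result means the partitions are pairwise disjoint, which is the sense of "logically exclusive" used above. It says nothing about whether the kernel's cpu_exclusive flag is set on any cpuset.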