On Fri, Jan 14, 2022 at 10:40:06AM +0100, Peter Zijlstra wrote:
> On Fri, Jan 14, 2022 at 10:31:55AM +0100, Peter Zijlstra wrote:
>
> > Also, by virtue of this being a start-stop annotation interface, the
> > accrued time might be arbitrarily large and arbitrarily delayed. I'm not
> > sure that's sensible.
> >
> > For tasks it might be better to mark the task and have the tick DTRT
> > instead of later trying to 'migrate' the time.
>
> Which is then very close to simply sticking the task into the right
> cgroup for a limited duration.
>
> You could do a special case sched_move_task(), that takes a css argument
> instead of using the current task_css. Then for cgroups it looks like
> nothing changes, but the scheduler will DTRT and act like it is in the
> target cgroup. Then at the end, simply move it back to task_css.

Yes, that's one of the things I tried. Less new code in the scheduler
this way.

> This obviously doesn't work for SoftIRQ accounting, but that is
> 'special' anyway. Softirq stuff is not otherwise under scheduler
> control and has preemption disabled.

It also doesn't work for memory reclaim since that shouldn't be throttled
in real time. Reclaim and softirq seem to demand something that doesn't
move tasks onto runqueues, that's external to avoid getting pushed off cpu.
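
For illustration only, here is a rough userspace model of the "temporarily act like the task is in the target cgroup" idea. It is not kernel code: `toy_cgroup`, `toy_task`, `toy_sched_move_task_to()`, `toy_sched_move_task_back()` and `toy_tick()` are hypothetical names made up for this sketch, and the real sched_move_task() does far more (runqueue dequeue/enqueue, locking). The only point is that the tick charges whichever group the scheduler currently believes the task is in, while the task's home group (standing in for task_css) never changes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a cgroup's CPU accounting; not the kernel's structure. */
struct toy_cgroup {
    const char *name;
    unsigned long ticks;            /* ticks charged to this group */
};

/*
 * Toy task. 'home' plays the role of task_css and never changes here;
 * 'sched_group' is the group the (toy) scheduler charges on each tick.
 */
struct toy_task {
    struct toy_cgroup *home;
    struct toy_cgroup *sched_group;
};

static void toy_task_init(struct toy_task *t, struct toy_cgroup *home)
{
    assert(home != NULL);
    t->home = home;
    t->sched_group = home;
}

/*
 * Hypothetical analogue of a sched_move_task() variant that takes an
 * explicit target instead of using the task's own group.
 */
static void toy_sched_move_task_to(struct toy_task *t, struct toy_cgroup *target)
{
    assert(target != NULL);
    t->sched_group = target;        /* 'home' is deliberately left untouched */
}

/* End of the annotated section: go back to the task's own group. */
static void toy_sched_move_task_back(struct toy_task *t)
{
    t->sched_group = t->home;
}

/* The tick charges time as it accrues, so nothing is migrated afterwards. */
static void toy_tick(struct toy_task *t)
{
    t->sched_group->ticks++;
}
```

In this model, a task that ticks once at home, twice while moved to another group, and once more after moving back ends up with two ticks charged to each group, and nothing has to be transferred after the fact. It says nothing about the softirq or reclaim cases, which don't fit this scheme.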