----- On Mar 24, 2020, at 3:30 PM, Mathieu Desnoyers mathieu.desnoyers@xxxxxxxxxxxx wrote:

> ----- On Mar 24, 2020, at 2:01 PM, Tejun Heo tj@xxxxxxxxxx wrote:
>
>> On Thu, Mar 12, 2020 at 03:47:50PM -0400, Mathieu Desnoyers wrote:
>>> The basic idea is to allow applications to pin to every possible cpu, but
>>> not allow them to use this to consume a lot of cpu time on CPUs they
>>> are not allowed to run.
>>>
>>> Thoughts ?
>>
>> One thing that we learned is that priority alone isn't enough in isolating cpu
>> consumptions no matter how low the priority may be if the workload is latency
>> sensitive. The actual computation capacity of cpus gets saturated way before cpu
>> time is saturated and latency impact from lowered mips becomes noticeable. So,
>> depending on workloads, allowing threads to run at the lowest priority on
>> disallowed cpus might not lead to behaviors that users expect but I have no idea
>> what kind of usage models you have on mind for the new system call.
>
> [...]

One possibility would be to use the SCHED_IDLE scheduling class rather than
SCHED_OTHER with nice +19. The unfortunate side effect, AFAIU, shows up when
a thread requests to be pinned to a CPU which is continuously overcommitted:
it may never run. This could come as a surprise to the user.

The only case where this would happen is if:

- A thread is pinned on CPU N, and
  - CPU N is not part of the allowed mask for the task's cpuset
    (and is overcommitted), or
  - CPU N is offline, and the fallback CPU is not part of the allowed
    mask for the task's cpuset (and is overcommitted).

Is this acceptable behavior? How is userspace supposed to detect this
kind of situation and mitigate it?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
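
For illustration only, the starvation condition described above can be written as a small predicate. This is a rough sketch, not an existing kernel or libc interface: the function name `may_starve` and all of its parameters are hypothetical, and it encodes one reading of the list, namely that "(and is overcommitted)" in the offline case refers to the fallback CPU. It says nothing about how userspace would actually learn which CPUs are overcommitted, which is the open question.

```python
def may_starve(pinned_cpu, allowed, online, overcommitted, fallback_cpu=None):
    """Return True if a SCHED_IDLE thread pinned to pinned_cpu might never run.

    All arguments are hypothetical inputs for this sketch:
      pinned_cpu    -- CPU number the thread asked to be pinned to
      allowed       -- set of CPUs in the allowed mask of the task's cpuset
      online        -- set of CPUs currently online
      overcommitted -- set of CPUs that always have higher-priority work
      fallback_cpu  -- CPU used instead when pinned_cpu is offline
    """
    if pinned_cpu in online:
        # Case 1: the thread runs on CPU N itself. It can only starve if N
        # is outside the allowed mask (so it is demoted to SCHED_IDLE there)
        # and N is continuously overcommitted.
        return pinned_cpu not in allowed and pinned_cpu in overcommitted

    # Case 2: CPU N is offline, so the thread runs on the fallback CPU.
    if fallback_cpu is None:
        return False
    return fallback_cpu not in allowed and fallback_cpu in overcommitted


# CPU 3 is online, outside the allowed mask {0, 1}, and overcommitted.
print(may_starve(3, {0, 1}, {0, 1, 2, 3}, {3}))
# CPU 3 is offline; the fallback CPU 0 is inside the allowed mask.
print(may_starve(3, {0, 1}, {0, 1, 2}, {2}, fallback_cpu=0))
```

With these inputs, the first call reports possible starvation and the second does not, because the fallback CPU is one the task is allowed to run on at normal priority.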