On 28/05/2020 20:29, Peter Zijlstra wrote:
> On Thu, May 28, 2020 at 05:51:31PM +0100, Qais Yousef wrote:
>
>> In my head, the simpler version of
>>
>>     if (rt_task(p) && !uc->user_defined)
>>         // update_uclamp_min
>>
>> is a single branch and a write to cache, so should be fast. I'm failing to
>> see how this could generate an overhead tbh, but will not argue about it :-)
>
> Mostly true; but you also had a load of that sysctl in there, which is
> likely to be a miss, and those are expensive.
>
> Also; if we're going to have to optimize this, the less logic is in there,
> the less we need to take out. Esp. for stuff that 'never' changes, like
> this.
>
>>> It's more code, but it is all outside of the normal paths where we care
>>> about performance.
>>
>> I am happy to take that direction if you think it's worth it. I'm thinking
>> task_woken_rt() is good. But again, maybe I am missing something.
>
> Basic rule: if the state 'never' changes, don't touch fast paths.
>
> Such little things can be very difficult to measure, but at some point
> they cause death by a thousand cuts.
>
>>> Indeed, that one. The fact that regular distros cannot enable this
>>> feature due to performance overhead is unfortunate. It means there is a
>>> lot less potential for this stuff.
>>
>> I had a humble try at catching the overhead but wasn't successful. So the
>> overhead wasn't simply overlooked by us then.
>
> Right, I remember us doing benchmarks when we introduced all this and
> clearly we missed something. It would be good if Mel could share which
> benchmark hurt most so we can go have a look.

IIRC, it was a local mmtests netperf-udp run with various buffer sizes?
At least that's what we're trying to run right now on a '2 Sockets Xeon
E5 2x10-Cores (40 CPUs)' machine with 3 different kernels: (1) wo_clamp,
(2) tsk_uclamp, (3) tskgrp_uclamp.

We currently have Ubuntu Desktop on it. I think systemd sets up cgroups
(especially the cpu controller) differently on an (Ubuntu) Server.
Maybe that has an influence here as well?
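
Just to make the trade-off above concrete, here is a rough, untested
sketch of the two variants being discussed. It assumes the per-task
p->uclamp_req[UCLAMP_MIN].user_defined flag behind the quoted snippet;
update_rt_uclamp_min() is a made-up placeholder for whatever actually
applies the default RT uclamp.min, not an existing kernel function:

	/* Variant 1: check in the wakeup fast path, as in the quoted snippet. */
	static void uclamp_rt_sync_fast_path(struct task_struct *p)
	{
		if (rt_task(p) && !p->uclamp_req[UCLAMP_MIN].user_defined)
			update_rt_uclamp_min(p);	/* placeholder helper */
	}

	/* Variant 2: re-sync all RT tasks once, only when the sysctl is written. */
	static void uclamp_rt_sync_on_sysctl_write(void)
	{
		struct task_struct *g, *p;

		read_lock(&tasklist_lock);
		for_each_process_thread(g, p) {
			if (rt_task(p) && !p->uclamp_req[UCLAMP_MIN].user_defined)
				update_rt_uclamp_min(p);
		}
		read_unlock(&tasklist_lock);
	}

The point of the second variant is that it only ever runs from the
sysctl write path (or some other slow path, e.g. task_woken_rt() as
mentioned above), so the wakeup path itself stays untouched.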