On Wed, Jun 23, 2021 at 19:28, Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> Hello Ronny.
>
> On Mon, Jun 14, 2021 at 05:29:35PM +0200, Ronny Meeus <ronny.meeus@xxxxxxxxx> wrote:
> > All apps are running in the realtime domain and I'm using kernel 4.9
> > and cgroup v1. [...] when it enters a full load condition [...]
> > I start to gradually reduce the budget of the cgroup until the system
> > is idle enough.
>
> Does your application have RT requirements, or is there another reason
> why you use group RT allocations? (When your app seems to require all
> CPU time, you decide to curb it. And it still fulfills RT requirements?)

The application does not have strict RT requirements. The main reason
for using cgroups is to reduce the load of the high-consumer
applications when the system is under high load, so that lower-priority
apps can also get a share of the CPU.

We initially worked with fixed cgroups, but this has the big
disadvantage that unused budget configured in one group cannot be used
by another group, so that processing power is basically lost.

> > But sometimes, immediately after the process assignment, it stops for
> > a short period (something like 1 or 2s) and then starts to consume 40%
> > again.
>
> What if you reduce cpu.rt_period_us (and cpu.rt_runtime_us
> proportionally)? (Are the pauses shorter?) Is there any useful info in
> /proc/$PID/stack during these periods?

I tried shorter periods, like 100ms instead of 1s, but the problem is
still observed. An algorithm that scales both values proportionally is
more complex to implement, and I don't think it would solve the issue
either.

About the stack: it is difficult to tell from the software when the
issue happens, so dumping the stack at the right moment is not easy,
but it is a good idea and I will certainly consider it.

To observe the system I use a Spirent traffic generator, which shows
me the number of processed packets in a graph.
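For what it's worth, Michal's suggestion amounts to shrinking cpu.rt_period_us while keeping the runtime/period ratio (the effective CPU share) constant. A minimal shell sketch of that, assuming a cgroup v1 cpu controller mounted at the usual location and a hypothetical group name "mygroup" (adjust to your hierarchy):

```shell
#!/bin/sh
# Sketch: move a cgroup from a 1 s RT period to a 100 ms period while
# keeping the same 40% budget. CG is a hypothetical example path.
CG=/sys/fs/cgroup/cpu/mygroup

PERIOD_US=100000                           # 100 ms instead of 1 s
SHARE_PCT=40                               # keep the same 40% share
RUNTIME_US=$((PERIOD_US * SHARE_PCT / 100))

echo "rt_period_us=$PERIOD_US rt_runtime_us=$RUNTIME_US"

# Apply only if the cgroup actually exists. When shrinking, write the
# runtime first: the kernel rejects a period smaller than the current
# runtime (runtime must never exceed the period).
if [ -d "$CG" ]; then
    echo "$RUNTIME_US" > "$CG/cpu.rt_runtime_us"
    echo "$PERIOD_US"  > "$CG/cpu.rt_period_us"
fi
```

The write ordering matters in practice: going from period=1s/runtime=400ms down to period=100ms, writing the new period first would leave runtime (400ms) larger than the period (100ms), which the kernel refuses.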
This makes it easy to see the short peaks during which the system is
not returning any packets.

> > Is that expected behavior?
>
> Someone with RT group scheduling knowledge may tell :-)
>
> HTH,
> Michal