Re: unexpected CPU pressure measurements when applying cpu.max control

(cc'ing Johannes who knows PSI a lot better than I do)

Hello, Michael.

On Wed, Jun 26, 2024 at 09:53:55AM +1000, Michael Fitz-Payne wrote:
> In short, processes executing within a CPU-limited cgroup are contributing
> to the system-wide CPU pressure measurement. This results in misleading data
> that points toward system CPU contention, when no system-wide contention
> exists.

This is in line with how PSI aggregation is defined for other resources. It
doesn't care why the pressure condition exists. For example, if system.slice
is the only runnable top-level cgroup and it's thrashing severely due to
memory.high, the system-level metrics will report full memory pressure.
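
To compare the system-wide numbers against a cgroup's own numbers, both files
can be parsed the same way, since /proc/pressure/cpu and a cgroup's
cpu.pressure share the "some"/"full" line format. The following is only a
minimal sketch: parse_psi and read_psi are hypothetical helper names, and the
cgroup path in the note below assumes cgroup v2 mounted at /sys/fs/cgroup.

```python
from pathlib import Path


def parse_psi(text):
    """Parse PSI text into a dict keyed by line kind ('some', 'full').

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    Lines that are absent (e.g. no 'full' line) simply produce no key.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        kind, pairs = fields[0], fields[1:]
        # Every remaining field is key=value; store all values as floats.
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


def read_psi(path):
    """Read and parse one PSI file (hypothetical convenience wrapper)."""
    return parse_psi(Path(path).read_text())
```

For instance, read_psi("/proc/pressure/cpu")["some"]["avg10"] could be
compared with the same field from
read_psi("/sys/fs/cgroup/system.slice/cpu.pressure"), assuming that cgroup
exists on the machine.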

...
> I've compared these tests on a 5.10.0 system as well as 6.8.9 (above).
> 
> There are two differences I can see:
> 
> - On 5.10 the 'full' line is not present in either the cgroup cpu.pressure
> interface or the kernel /proc/pressure/cpu interface. I'm assuming this was
> added in a newer kernel at some point.

Yes. Because full pressure is defined in terms of CPU cycles that couldn't
be consumed due to lack of the resource, we initially didn't have a
definition for CPU full pressure. Later, we used it for measuring cpu.max
throttling. That makes some sense, but it can also be argued that it's not
quite the same thing.
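
Since the 'full' line may be missing on older kernels, any tooling that
samples these files has to tolerate its absence. As a rough sketch (the
function names are hypothetical), the "total" field, which is a cumulative
stall time in microseconds, can be sampled twice and turned into a percentage
of the interval, returning None when the requested line is not there:

```python
def psi_total_us(text, kind):
    """Return the cumulative 'total' (microseconds) for the given line kind.

    Returns None if the file has no such line, e.g. no 'full' line.
    """
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == kind:
            for pair in fields[1:]:
                key, _, value = pair.partition("=")
                if key == "total":
                    return int(value)
    return None


def stall_percent(before, after, kind, interval_s):
    """Percentage of interval_s spent stalled between two samples.

    'before' and 'after' are the raw contents of the same PSI file read
    interval_s seconds apart. Returns None if 'kind' is missing from either.
    """
    t0 = psi_total_us(before, kind)
    t1 = psi_total_us(after, kind)
    if t0 is None or t1 is None:
        return None
    return 100.0 * (t1 - t0) / (interval_s * 1_000_000)
```

Running this against both /proc/pressure/cpu and a throttled cgroup's
cpu.pressure over the same interval would show how much of the system-level
'some' figure lines up with the cgroup's own stall time.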

> - On 6.8.9 the 'full' line in the cgroup cpu.pressure interface appears to
> provide accurate data based on this simple test.
> 
> As we know, the kernel 'full' measurement is undefined.

How do you mean?

> In either case, the kernel PSI interface is the canonical source from which
> we want to read the measurements for warning us of CPU contention on our
> fleet of machines. Due to this unexpected accounting, the values may be
> misleading.
> 
> Frankly, I'm not sure of what the behaviour should be. I can see the
> argument that the current value is correct, given the definition is 'some'
> tasks are waiting on CPU.

This sounds more like you want to measure local (non-hierarchical) pressure.
Maybe that makes sense, although I'm not sure whether it can be defined
neatly.

Thanks.

-- 
tejun



