Re: [PATCH v7 00/13] fold per-CPU vmstats remotely

Marcelo Tosatti <mtosatti@xxxxxxxxxx> · Thu, 23 Mar 2023 07:59:14 -0300

On Thu, Mar 23, 2023 at 07:52:22AM -0300, Marcelo Tosatti wrote:
> On Thu, Mar 23, 2023 at 08:51:14AM +0100, Michal Hocko wrote:
> > On Wed 22-03-23 11:20:55, Marcelo Tosatti wrote:
> > > On Wed, Mar 22, 2023 at 02:35:20PM +0100, Michal Hocko wrote:
> > [...]
> > > > > "Performance details for the kworker interruption:
> > > > > 
> > > > > oslat   1094.456862: sys_mlock(start: 7f7ed0000b60, len: 1000)
> > > > > oslat   1094.456971: workqueue_queue_work: ... function=vmstat_update ...
> > > > > oslat   1094.456974: sched_switch: prev_comm=oslat ... ==> next_comm=kworker/5:1 ...
> > > > > kworker 1094.456978: sched_switch: prev_comm=kworker/5:1 ==> next_comm=oslat ...
> > > > > 
> > > > > The example above shows an additional 7us for the
> > > > > 
> > > > >         oslat -> kworker -> oslat
> > > > > 
> > > > > switches. In the case of a virtualized CPU, and the vmstat_update
> > > > > interruption in the host (of a qemu-kvm vcpu), the latency penalty
> > > > > observed in the guest is higher than 50us, violating the acceptable
> > > > > latency threshold for certain applications."
> > > > 
> > > > Yes, I have seen that but it doesn't really give a wider context to
> > > > understand why those numbers matter.
> > > 
> > > OK.
> > > 
> > > "In the case of RAN, a MAC scheduler with TTI=1ms, this causes >100us
> > > interruption observed in a guest (which is above the safety
> > > threshold for this application)."
> > > 
> > > Is that OK?
> > 
> > This might be sufficient information for somebody familiar with the
> > matter (not me). So no, not enough. We need to hear a more complete
> > story.
> 
> Michal,
> 
> Please refer to 
> https://www.diva-portal.org/smash/get/diva2:541460/FULLTEXT01.pdf
> 
> 2.3 Channel Dependent Scheduling
> The purpose of scheduling is to decide which terminal will transmit data on which set
> of resource blocks with what transport format to use. The objective is to assign
> resources to the terminal such that the quality of service (QoS) requirement is fulfilled.
> Scheduling decision is taken every 1 ms by base station (termed as eNodeB) as the
> same length of Transmission Time Interval (TTI) in LTE system.
> 
> In general:
> 
> https://en.wikipedia.org/wiki/Real-time_computing
> 
> Real-time computing (RTC) is the computer science term for hardware and
> software systems subject to a "real-time constraint", for example from
> event to system response.[1] Real-time programs must guarantee response
> within specified time constraints, often referred to as "deadlines".[2]
> 
> Real-time responses are often understood to be in the order of
> milliseconds, and sometimes microseconds. A system not specified as
> operating in real time cannot usually guarantee a response within any
> timeframe, although typical or expected response times may be given.
> Real-time processing fails if not completed within a specified deadline
> relative to an event; deadlines must always be met, regardless of system
> load.
> 
> For example, the MAC scheduler must run every 1ms, and a certain
> amount of computation takes place in each run (and must finish before
> the next 1ms timeframe). A > 50us latency spike as observed by cyclictest
> is considered a "failure".

If you need more detail, you will have to ask someone else, because that
is all I know.
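
For reference, the arithmetic behind the numbers quoted above can be
sketched as follows. This is only an illustration: the helper names
(parse_us, interruption_us, exceeds_threshold) are made up for this
example, and measuring the 7us from the workqueue_queue_work event to
the switch back to oslat is my reading of the trace, not something the
trace states. The 50us threshold and 1ms TTI are the figures given
earlier in the thread.

```python
import re

# Trace lines copied from the quoted oslat example; elided fields stay as "...".
TRACE = """\
oslat   1094.456862: sys_mlock(start: 7f7ed0000b60, len: 1000)
oslat   1094.456971: workqueue_queue_work: ... function=vmstat_update ...
oslat   1094.456974: sched_switch: prev_comm=oslat ... ==> next_comm=kworker/5:1 ...
kworker 1094.456978: sched_switch: prev_comm=kworker/5:1 ==> next_comm=oslat ...
"""

TTI_US = 1000        # MAC scheduler period (TTI=1ms), from the thread
THRESHOLD_US = 50    # latency spike considered a "failure", from the thread

# <comm> <seconds>.<microseconds>: <event name>
LINE_RE = re.compile(r"^\S+\s+(\d+)\.(\d{6}):\s+(\w+)")


def parse_us(trace):
    """Return (timestamp in integer microseconds, event name) per trace line."""
    events = []
    for line in trace.splitlines():
        m = LINE_RE.match(line)
        if m:
            secs, usecs, name = m.groups()
            events.append((int(secs) * 1_000_000 + int(usecs), name))
    return events


def interruption_us(events):
    """Assumed definition: time from queueing the work item until the
    last sched_switch, which hands the CPU back to oslat."""
    queued = next(t for t, name in events if name == "workqueue_queue_work")
    resumed = max(t for t, name in events if name == "sched_switch")
    return resumed - queued


def exceeds_threshold(latency_us):
    """True if a latency spike is larger than the 50us threshold."""
    return latency_us > THRESHOLD_US


if __name__ == "__main__":
    delta = interruption_us(parse_us(TRACE))
    print(f"interruption: {delta}us, failure: {exceeds_threshold(delta)}")
    print(f"threshold is {100 * THRESHOLD_US / TTI_US:.0f}% of the TTI")
```

So the bare-metal interruption in the trace stays under the threshold,
while the >50us and >100us spikes seen in a guest do not.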