On Wed, Mar 10, 2021 at 7:42 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Wed, Mar 03, 2021, Haiwei Li wrote:
> > On 21/3/3 10:09, lihaiwei.kernel@xxxxxxxxx wrote:
> > > From: Haiwei Li <lihaiwei@xxxxxxxxxxx>
> > >
> > > In my test environment, advance_expire_delta is frequently greater than
> > > the fixed LAPIC_TIMER_ADVANCE_ADJUST_MAX. And this will hinder the
> > > adjustment.
> >
> > Supplementary details:
> >
> > I have tried to backport timer related features to our production
> > kernel.
> >
> > After completing the backport, I found that advance_expire_delta is
> > frequently greater than the fixed value. It's necessary to turn the
> > fixed value into a dynamic one.
>
> Does this reproduce on an upstream kernel? If so...
>
>   1. How much over the 10k cycle limit is the delta?
>   2. Any idea what causes the large delta? E.g. is there something that can
>      and/or should be fixed elsewhere?
>   3. Is it platform/CPU specific?

Hi Sean,

I have traced the flow on our production kernel, and it frequently takes
more than 10K cycles to get from sched_out to sched_in.

I then tested two scenarios on a Cascade Lake server (96 pCPUs) with a
v5.11 kernel:

1. cyclictest only, in the guest (88 vCPUs bound to isolated pCPUs, mwait
   not exposed, adaptive advance lapic timer at its default of -1).

   Ratio of occurrences, greater_than_10k/total: 29/2060, 1.41%

2. cyclictest in the guest (88 vCPUs, not bound, mwait not exposed,
   adaptive advance lapic timer at its default of -1) and stress on the
   host (no isolation).

   Ratio of occurrences, greater_than_10k/total: 122381/1017363, 12.03%

--
Haiwei Li

>
> Ideally, KVM would play nice with "all" environments by default without forcing
> the admin to hand-tune things.
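
For anyone who wants to re-derive the two ratios above from raw trace data,
here is a minimal sketch. It is not taken from the patch or from KVM: the
helper names and the ADJUST_MAX_CYCLES constant are hypothetical, and it
assumes that a sample counts as "over the limit" when the magnitude of
advance_expire_delta exceeds the 10k cycle limit mentioned in the thread.

```python
# Hypothetical constant mirroring the "10k cycle limit" from the thread
# (LAPIC_TIMER_ADVANCE_ADJUST_MAX); the value is assumed, not read from a kernel.
ADJUST_MAX_CYCLES = 10_000


def count_over_limit(deltas, limit=ADJUST_MAX_CYCLES):
    """Count samples whose magnitude exceeds the limit (assumed criterion)."""
    return sum(1 for d in deltas if abs(d) > limit)


def over_limit_ratio(over, total):
    """Return the share of over-limit samples as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * over / total


if __name__ == "__main__":
    # Raw counts reported in the thread for the two scenarios.
    scenarios = {
        "bound to isolated pCPUs": (29, 2060),
        "not bound, stress on host": (122381, 1017363),
    }
    for name, (over, total) in scenarios.items():
        print(f"{name}: {over}/{total} = {over_limit_ratio(over, total):.2f}%")
```

count_over_limit() is meant to be fed the per-event deltas (in cycles)
collected from tracing; the main block only reuses the counts already
reported above.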