Hi Anna-Maria,

On Wed, May 29, 2019 at 04:53:05PM +0200, Anna-Maria Gleixner wrote:
> On Mon, 15 Apr 2019, Marcelo Tosatti wrote:
> 
> [...]
> 
> > The patch "timers: do not raise softirq unconditionally" from Thomas
> > attempts to address that by checking, in the sched tick, whether it's
> > necessary to raise the timer softirq.

https://lore.kernel.org/patchwork/patch/446045/

> > Unfortunately, it attempts to grab the tvec base spinlock, which
> > generates the issue described in the patch "Revert "timers: do not
> > raise softirq unconditionally"".

https://lore.kernel.org/patchwork/patch/552474/

> Neither patch is available in the version your patch set is based on.
> Better pointers would be helpful.

See above.

> > tvec_base->lock protects addition of timers to the wheel versus
> > timer interrupt execution.
> 
> The timer_base->lock (formerly known as tvec_base->lock) synchronizes all
> accesses to timer_base, not only addition of timers versus timer
> interrupt execution. Deletion of timers, getting the next timer interrupt,
> forwarding the base clock and migration of timers are protected by
> timer_base->lock as well.

Right.

> > This patch does not grab the tvec base spinlock from irq context,
> > but rather performs a lockless access to base->pending_map.
> 
> I cannot see where this patch performs a lockless access to
> timer_base->pending_map.

[patch 2/3] timers: do not raise softirq unconditionally (spinlockless version)

> > It handles the race between timer addition and timer interrupt
> > execution by unconditionally (in the case of isolated CPUs) raising the
> > timer softirq after making sure the updated bitmap is visible
> > on remote CPUs.
> 
> So after modifying a timer on a non-housekeeping timer base, the timer
> softirq is raised - even if there is no pending timer in the next
> bucket. Only with this patch, this shouldn't be a problem - but it is an
> additional raise of the timer softirq and an overhead when adding a timer,
> because the normal timer softirq is raised from the sched tick anyway.

It should be clear why this is necessary when reading

[patch 2/3] timers: do not raise softirq unconditionally (spinlockless version)

> > Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> > 
> > ---
> >  kernel/time/timer.c |   38 ++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 38 insertions(+)
> > 
> > Index: linux-rt-devel/kernel/time/timer.c
> > ===================================================================
> > --- linux-rt-devel.orig/kernel/time/timer.c    2019-04-15 13:56:06.974210992 -0300
> > +++ linux-rt-devel/kernel/time/timer.c 2019-04-15 14:21:02.788704354 -0300
> > @@ -1056,6 +1063,17 @@
> >             internal_add_timer(base, timer);
> >     }
> > 
> > +   if (!housekeeping_cpu(base->cpu, HK_FLAG_TIMER) &&
> > +       !(timer->flags & TIMER_DEFERRABLE)) {
> > +           call_single_data_t *c;
> > +
> > +           c = per_cpu_ptr(&raise_timer_csd, base->cpu);
> > +
> > +           /* Make sure bitmap updates are visible on remote CPUs */
> > +           smp_wmb();
> > +           smp_call_function_single_async(base->cpu, c);
> > +   }
> > +
> >  out_unlock:
> >     raw_spin_unlock_irqrestore(&base->lock, flags);
> > 
> 
> Could you please explain why you decided to use the above
> implementation for raising the timer softirq after modifying a timer?
Because of the following race condition, which is open after "[patch 2/3]
timers: do not raise softirq unconditionally (spinlockless version)":

    CPU-0                                   CPU-1

    jiffies=99
    runs add_timer_on, with
    timer->expires=100
                                            jiffies=100
                                            run_softirq(), sees
                                            pending bitmap clear
    add_timer_on returns and
    timer was not executed    P)

This race did not exist before. So by raising a softirq on the remote CPU
at point P), it's ensured the timer will be executed ASAP.
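
To make that concrete, here is a minimal sketch of how the per-CPU
raise_timer_csd referenced in the hunk above can be wired up. Only the name
raise_timer_csd comes from the quoted patch; the callback name, the init
function name and the include lines are illustrative, not taken from the
actual series:

#include <linux/init.h>
#include <linux/smp.h>
#include <linux/percpu.h>
#include <linux/cpumask.h>
#include <linux/interrupt.h>

static DEFINE_PER_CPU(call_single_data_t, raise_timer_csd);

/* Runs in IPI context on the isolated CPU. */
static void raise_timer_softirq_fn(void *unused)
{
	/*
	 * The enqueueing CPU issued smp_wmb() before sending the csd,
	 * so the pending_map update is visible here and the softirq
	 * handler will find the newly added timer.
	 */
	raise_softirq(TIMER_SOFTIRQ);
}

static void __init raise_timer_csd_init(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		call_single_data_t *c = per_cpu_ptr(&raise_timer_csd, cpu);

		c->func = raise_timer_softirq_fn;
		c->info = NULL;
	}
}

raise_timer_csd_init() would be called once during timer setup (from
init_timers(), for example - placement is an assumption here). With that in
place, the add path only needs the smp_wmb() + smp_call_function_single_async()
pair shown in the hunk above, and base->lock is never taken from irq context
on the isolated CPU.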