Le Thu, Apr 04, 2024 at 06:50:26PM +0200, Anna-Maria Behnsen a écrit : > Commit 4b6f4c5a67c0 ("timer/migration: Remove buggy early return on > deactivation") removed the logic to return early in tmigr_update_events() > on deactivation. With this the problem with a not properly updated first > global event in a hierarchy containing only a single group was fixed. > > But when having a look at this code path with a hierarchy with more than a > single level, now unnecessary work is done (example is partially copied > from the message of the commit mentioned above): > > [GRP1:0] > migrator = GRP0:0 > active = GRP0:0 > nextevt = T0:0i, T0:1 > / \ > [GRP0:0] [GRP0:1] > migrator = 0 migrator = NONE > active = 0 active = NONE > nextevt = T0i, T1 nextevt = T2 > / \ / \ > 0 (T0i) 1 (T1) 2 (T2) 3 > active idle idle idle > > 0) CPU 0 is active thus its event is ignored (the letter 'i') and so are > upper levels' events. CPU 1 is idle and has the timer T1 enqueued. > CPU 2 also has a timer. The expiry order is T0 (ignored) < T1 < T2 > > [GRP1:0] > migrator = GRP0:0 > active = GRP0:0 > nextevt = T0:0i, T0:1 > / \ > [GRP0:0] [GRP0:1] > migrator = NONE migrator = NONE > active = NONE active = NONE > nextevt = T1 nextevt = T2 > / \ / \ > 0 (T0i) 1 (T1) 2 (T2) 3 > idle idle idle idle > > 1) CPU 0 goes idle without global event queued. Therefore KTIME_MAX is > pushed as its next expiry and its own event kept as "ignore". Without this > early return the following steps happen in tmigr_update_events() when > child = null and group = GRP0:0 : > > lock(GRP0:0->lock); > timerqueue_del(GRP0:0, T0i); > unlock(GRP0:0->lock); > > > [GRP1:0] > migrator = NONE > active = NONE > nextevt = T0:0, T0:1 > / \ > [GRP0:0] [GRP0:1] > migrator = NONE migrator = NONE > active = NONE active = NONE > nextevt = T1 nextevt = T2 > / \ / \ > 0 (T0i) 1 (T1) 2 (T2) 3 > idle idle idle idle > > 2) The change now propagates up to the top. Then tmigr_update_events() > updates the group event of GRP0:0 and executes the following steps > (child = GRP0:0 and group = GRP0:0): > > lock(GRP0:0->lock); > lock(GRP1:0->lock); > evt = tmigr_next_groupevt(GRP0:0); -> this removes the ignored events > in GRP0:0 > ... update GRP1:0 group event and timerqueue ... > unlock(GRP1:0->lock); > unlock(GRP0:0->lock); > > So the dance in 1) with locking the GRP0:0->lock and removing the T0i from > the timerqueue is redundand as this is done nevertheless in 2) when > tmigr_next_groupevt(GRP0:0) is executed. > > Revert commit 4b6f4c5a67c0 ("timer/migration: Remove buggy early return on > deactivation") and add a condition into return path to skip the return > only, when hierarchy contains a single group. > > Fixes: 4b6f4c5a67c0 ("timer/migration: Remove buggy early return on deactivation") > Signed-off-by: Anna-Maria Behnsen <anna-maria@xxxxxxxxxxxxx> Reviewed-by: Frederic Weisbecker <frederic@xxxxxxxxxx> Just some comment nits: > --- > kernel/time/timer_migration.c | 25 +++++++++++++++++++++++++ > 1 file changed, 25 insertions(+) > > --- a/kernel/time/timer_migration.c > +++ b/kernel/time/timer_migration.c > @@ -751,6 +751,31 @@ bool tmigr_update_events(struct tmigr_gr > > first_childevt = evt = data->evt; > > + /* > + * Walking the hierarchy is required in any case when a > + * remote expiry was done before. This ensures to not lose > + * already queued events in non active groups (see section > + * "Required event and timerqueue update after a remote > + * expiry" in the documentation at the top). > + * > + * The two call sites which are executed without a remote expiry > + * before, are not prevented from propagating changes through > + * the hierarchy by the return: > + * - When entering this path by tmigr_new_timer(), @evt->ignore > + * is never set. > + * - tmigr_inactive_up() takes care of the propagation by > + * itself and ignores the return value. But an immediate > + * return is required because nothing has to be done in this > + * level as the event could be ignored. It's not exactly required, it's an optimization. How about: """ But an immediate return is possible if there is a parent, sparing group locking at this level, because the upper walking call to the parent will take care about removing this event from within the group and update next_expiry accordingly. """ > + * > + * But, if the hierarchy has only a single level so @group is > + * the top level group, make sure first event information of the > + * group is updated properly and also handled properly, so skip > + * this fast return path. """ However if there is no parent, ie: the hierarchy has only a single level so @group is the top level group, make sure the first event information of the group is updated properly and also handled properly, so skip this fast return path. """ Thanks. > + */ > + if (evt->ignore && !remote && group->parent) > + return true; > + > raw_spin_lock(&group->lock); > > childstate.state = 0;