On Thu, 2023-09-07 at 16:59 +0800, Hao Jia wrote:
> 
> On 2023/9/5 Peter Zijlstra wrote:
> > On Thu, Aug 31, 2023 at 04:48:29PM +0800, Hao Jia wrote:
> > 
> > > If I understand correctly, rq->clock_update_flags may be set to
> > > RQCF_ACT_SKIP after __schedule() holds the rq lock, and sometimes the rq
> > > lock may be released briefly in __schedule(), such as newidle_balance(). At
> > > this time Other CPUs hold this rq lock, and then calling
> > > rq_clock_start_loop_update() may trigger this warning.
> > > 
> > > This warning check might be wrong. We need to add assert_clock_updated() to
> > > check that the rq clock has been updated before calling
> > > rq_clock_start_loop_update().
> > > 
> > > Maybe some things can be like this?
> > 
> > Urgh, aside from it being white space mangled, I think this is entirely
> > going in the wrong direction.
> > 
> > Leaking ACT_SKIP is dodgy as heck.. it's entirely too late to think
> > clearly though, I'll have to try again tomorrow.

I am trying to understand why this is an ACT_SKIP leak. Before the call
to __cfsb_csd_unthrottle(), is it possible that someone else locks the
runqueue, sets ACT_SKIP, and releases the rq lock, and then never
updates the rq_clock?

> 
> Hi Peter,
> 
> Do you think this fix method is correct? Or should we go back to the
> beginning and move update_rq_clock() from unthrottle_cfs_rq()?
> 

If everyone who locks the runqueue and sets ACT_SKIP also updates the
rq_clock, I think your change is okay. Otherwise the rq_clock could
miss an update.

Thanks.

Tim
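
P.S. To make the interleaving Hao describes concrete, here is a rough
user-space model of it. This is only a sketch, not kernel code: struct
model_rq, the model_*() helpers, and the boolean "lock" are all
hypothetical stand-ins, and I am assuming rq_clock_start_loop_update()
warns whenever ACT_SKIP is already set, as the thread implies. The flag
value 0x02 is what I believe kernel/sched/sched.h uses.

```c
/* User-space model only; names prefixed model_ are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define RQCF_ACT_SKIP 0x02 /* assumed to match kernel/sched/sched.h */

struct model_rq {
	bool locked;                     /* stand-in for the rq lock */
	unsigned int clock_update_flags;
	unsigned int warnings;           /* counts would-be warning hits */
};

static void model_lock(struct model_rq *rq)
{
	assert(!rq->locked);
	rq->locked = true;
}

static void model_unlock(struct model_rq *rq)
{
	assert(rq->locked);
	rq->locked = false;
}

/* Models rq_clock_start_loop_update(): warn if ACT_SKIP is already set. */
static void model_start_loop_update(struct model_rq *rq)
{
	assert(rq->locked);
	if (rq->clock_update_flags & RQCF_ACT_SKIP)
		rq->warnings++;
	rq->clock_update_flags |= RQCF_ACT_SKIP;
}

/* Models rq_clock_stop_loop_update(): clear ACT_SKIP again. */
static void model_stop_loop_update(struct model_rq *rq)
{
	assert(rq->locked);
	rq->clock_update_flags &= ~RQCF_ACT_SKIP;
}

/* Replays the interleaving sequentially; returns the warning count. */
unsigned int model_interleaving(void)
{
	struct model_rq rq = { false, 0, 0 };

	/* CPU A: __schedule() takes the lock and ACT_SKIP becomes set. */
	model_lock(&rq);
	rq.clock_update_flags |= RQCF_ACT_SKIP;

	/* CPU A: drops the lock briefly, e.g. in newidle_balance(). */
	model_unlock(&rq);

	/* CPU B: takes the same lock and starts a loop update. */
	model_lock(&rq);
	model_start_loop_update(&rq);    /* sees CPU A's ACT_SKIP */
	model_stop_loop_update(&rq);
	model_unlock(&rq);

	return rq.warnings;
}
```

In this model, model_interleaving() returns 1: CPU B hits the warning
only because CPU A's ACT_SKIP is still visible while CPU A has the lock
dropped. Whether the clock was actually updated before that point is
exactly the part the model does not capture.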