On Tue, May 2, 2023 at 12:30 PM Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote: > > Sorry for the bad format of my previous email, there is always > something wrong when I press Ctrl+C and Ctrl+V in my browser. > Sorry for the inconvenience that I bring. > > On Tue, May 2, 2023 at 11:53 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote: > > > > > > > > > On May 2, 2023, at 9:50 AM, Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote: > > > > > > On Tue, May 2, 2023 at 9:40 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > > >> > > >>> On Tue, May 02, 2023 at 08:01:41AM +0800, zhouzhouyi@xxxxxxxxx wrote: > > >>> From: Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> > > >>> > > >>> In kfree_rcu_test, kfree_scale_shutdown will be detected as hung task > > >>> if kfree_loops is too big. Replace wait_event with wait_event_idle > > >>> to avoid false positive. > > >>> > > >>> Tested in the PPC VM of Open Source Lab of Oregon State University. > > >>> > > >>> Suggested-by: Paul E. McKenney <paulmck@xxxxxxxxxx> > > >>> Signed-off-by: Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> > > >> > > >> Good catch, thank you! > > >> > > >> However, this commit beat you to it: > > >> > > >> ef1ef3d47677 ("rcuscale: Move shutdown from wait_event() to wait_event_idle()") > > > You are very welcome ;-) Still, this is a very fruitful learning > > > process for me ;-) > > > > Speaking of learning, can you explain why it fixes the issue? ;-) > > > > Your change log lacked the real reason but that is Ok since change logs can only tell you so much. I admit I myself did not know the reason till I read some code. > > > > Quiz: why exactly does this all work out in the end even though the hung task detector saw it hung? > When I found "blocked for more than" report in console.log, I did a > grep in kernel source tree, and found check_hung_task will only be > invoked > from check_hung_uninterruptible_tas > ks when (state & TASK_UNINTERRUPTIBLE) and !(state & TASK_WAKEKILL) > and !(state & > TASK_NOLOAD), so I send my first patch [1], and Paul tell me > wait_event_idle is the right answer, then I do a grep for recursively > expansion of macro wait_event_idle, which finally set current->state > to TASK_IDLE ((TASK_UNINTERRUPTIBLE | TASK_NOLOAD)), then I conclude > that check_hung_task will not be called. Yes, _interruptible() > variant indicates that wakeups due to things like POSIX signals are > permitted, which are not our case. That sounds right! > I should investigate more on what wait_event_idle really means, > instead of merely grep and debug what wait_event_idle does. > I should write my change log more clearly and explain why wait_event_idle works. Sounds good, the header comments around those APIs should also help. > > I discovered [2] during the testing of the patch, this is also a very > fruitful learning process. > > I will write my change log more indicative next time, and do better > work to our RCU community (and thanks for remove linux-kernel from CC > list ;-)) Sounds good, have fun! :) - Joel