On Mon, Jul 01, 2019 at 05:28:25PM -0400, Steven Rostedt wrote:
> On Mon, 1 Jul 2019 17:13:33 -0400
> Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> > On Mon, 1 Jul 2019 17:06:02 -0400
> > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> > > On Mon, 1 Jul 2019 15:43:25 -0500
> > > Corey Minyard <cminyard@xxxxxxxxxx> wrote:
> > >
> > >
> > > > I show that patch is already applied at
> > > >
> > > >     1921ea799b7dc561c97185538100271d88ee47db
> > > >     sched/completion: Fix a lockup in wait_for_completion()
> > > >
> > > > git describe --contains 1921ea799b7dc561c97185538100271d88ee47db
> > > > v4.19.37-rt20~1
> > > >
> > > > So I'm not sure what is going on.
> > >
> > > Bah, I'm replying to the wrong commit that I'm having issues with.
> > >
> > > I searched your name to find the patch that is of trouble, and picked
> > > this one.
> > >
> > > I'll go find the problem patch, sorry for the noise on this one.
> > >
> >
> > No, I did reply to the right email, but it wasn't the top patch I was
> > having issues with. It was the patch I replied to:
> >
> > This change below that Sebastian marked as stable-rt is what is causing
> > me an issue. Not the patch that started the thread.
> >
>
> In fact, my system doesn't boot with this commit in 5.0-rt.
>
> If I revert 90e1b18eba2ae4a729 ("swait: Delete the task from after a
> wakeup occured") the machine boots again.
>
> Sebastian, I think that's a bad commit, please revert it.

Yeah.  d_wait_lookup() does not use __SWAITQUEUE_INITIALIZER() to
initialize its queue item, but it does use swake_up_all(), so the
wakeup goes into an infinite loop: remove isn't set, so the item is
never taken off the list.

I suspect there are other places where this is the case.

-corey

>
> Thanks!
>
> -- Steve
>
> >
> >
> > >
> > > Now.. that will fix it, but I think it is also wrong.
> > >
> > > The problem being that it violates FIFO, something that might be more
> > > important on -RT than elsewhere.
> > >
> > > The regular wait API seems confused/inconsistent when it uses
> > > autoremove_wake_function and default_wake_function, which doesn't help,
> > > but we can easily support this with swait -- the problematic thing is
> > > the custom wake functions, we musn't do that.
> > >
> > > (also, mingo went and renamed a whole bunch of wait_* crap and didn't do
> > > the same to swait_ so now its named all different :/)
> > >
> > > Something like the below perhaps.
> > >
> > > ---
> > > diff --git a/include/linux/swait.h b/include/linux/swait.h
> > > index 73e06e9986d4..f194437ae7d2 100644
> > > --- a/include/linux/swait.h
> > > +++ b/include/linux/swait.h
> > > @@ -61,11 +61,13 @@ struct swait_queue_head {
> > >  struct swait_queue {
> > >  	struct task_struct	*task;
> > >  	struct list_head	task_list;
> > > +	unsigned int		remove;
> > >  };
> > >
> > >  #define __SWAITQUEUE_INITIALIZER(name) {				\
> > >  	.task		= current,					\
> > >  	.task_list	= LIST_HEAD_INIT((name).task_list),		\
> > > +	.remove		= 1,						\
> > >  }
> > >
> > >  #define DECLARE_SWAITQUEUE(name)					\
> > > diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> > > index e83a3f8449f6..86974ecbabfc 100644
> > > --- a/kernel/sched/swait.c
> > > +++ b/kernel/sched/swait.c
> > > @@ -28,7 +28,8 @@ void swake_up_locked(struct swait_queue_head *q)
> > >
> > >  	curr = list_first_entry(&q->task_list, typeof(*curr), task_list);
> > >  	wake_up_process(curr->task);
> > > -	list_del_init(&curr->task_list);
> > > +	if (curr->remove)
> > > +		list_del_init(&curr->task_list);
> > >  }
> > >  EXPORT_SYMBOL(swake_up_locked);
> > >
> > > @@ -57,7 +58,8 @@ void swake_up_all(struct swait_queue_head *q)
> > >  		curr = list_first_entry(&tmp, typeof(*curr), task_list);
> > >
> > >  		wake_up_state(curr->task, TASK_NORMAL);
> > > -		list_del_init(&curr->task_list);
> > > +		if (curr->remove)
> > > +			list_del_init(&curr->task_list);
> > >
> > >  		if (list_empty(&tmp))
> > >  			break;
> > >
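
For illustration only, here is a minimal userspace sketch of the
infinite loop described in the reply above. It is not kernel code: the
list helpers, wake_all_model(), demo_wake() and the iteration cap are
all hypothetical stand-ins, and the "remove isn't set" case is modeled
as remove == 0. It only shows why a conditional list_del_init() inside a
"while the list is not empty" loop never terminates when a waiter was
set up without the initializer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel list helpers. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Simplified waiter: only the fields that matter for the loop. */
struct swait_queue {
	struct list_head task_list;
	unsigned int remove;	/* field added by the quoted patch */
};

/*
 * Models the patched wake-all loop. Returns the number of iterations
 * needed to drain the list, or -1 once max_iter is exceeded (the real
 * loop has no such cap and would simply keep spinning).
 */
static int wake_all_model(struct list_head *q, int max_iter)
{
	int n = 0;

	assert(max_iter > 0);
	while (!list_empty(q)) {
		struct swait_queue *curr =
			container_of(q->next, struct swait_queue, task_list);

		if (++n > max_iter)
			return -1;
		/* the wakeup itself would happen here */
		if (curr->remove)
			list_del_init(&curr->task_list);
	}
	return n;
}

/* Queue a single waiter with the given remove value and wake everyone. */
static int demo_wake(unsigned int remove)
{
	struct list_head q;
	struct swait_queue w;

	list_init(&q);
	list_init(&w.task_list);
	w.remove = remove;
	list_add_tail(&w.task_list, &q);
	return wake_all_model(&q, 1000);
}
```

In this model demo_wake(1), the initializer-style waiter, drains in one
iteration, while demo_wake(0), the hand-initialized waiter, hits the cap
and returns -1.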