On Thu, Jul 11, 2019 at 03:40:06PM +0200, Oleg Nesterov wrote: > On 07/11, Oleg Nesterov wrote: > > > > Jens, > > > > I managed to convince myself I understand why 2/2 needs this change... > > But rq_qos_wait() still looks suspicious to me. Why can't the main loop > > "break" right after io_schedule()? rq_qos_wake_function() either sets > > data->got_token = true or it doesn't wakeup the waiter sleeping in > > io_schedule() > > > > This means that data.got_token = F at the 2nd iteration is only possible > > after a spurious wakeup, right? But in this case we need to set state = > > TASK_UNINTERRUPTIBLE again to avoid busy-wait looping ? > > Oh. I can be easily wrong, I never read this code before, but it seems to > me there is another unrelated race. > > rq_qos_wait() can't rely on finish_wait() because it doesn't necessarily > take wq_head->lock. > > rq_qos_wait() inside the main loop does > > if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) { > finish_wait(&rqw->wait, &data.wq); > > /* > * We raced with wbt_wake_function() getting a token, > * which means we now have two. Put our local token > * and wake anyone else potentially waiting for one. > */ > if (data.got_token) > cleanup_cb(rqw, private_data); > break; > } > > finish_wait() + "if (data.got_token)" can race with rq_qos_wake_function() > which does > > data->got_token = true; > list_del_init(&curr->entry); > Argh finish_wait() does __set_current_state, well that's shitty. I guess we need to do data->got_token = true; smp_wmb() list_del_init(&curr->entry); and then do smp_rmb(); if (data.got_token) cleanup_cb(rqw, private_data); to be safe? > rq_qos_wait() can see these changes out-of-order: finish_wait() can see > list_empty_careful() == T and avoid wq_head->lock, and in this case the > code above can see data->got_token = false. > > No? > > and I don't really understand > > has_sleeper = false; > > at the end of the main loop. I think it should do "has_sleeper = true", > we need to execute the code above only once, right after prepare_to_wait(). > But this is harmless. We want has_sleeper = false because the second time around we just want to grab the inflight counter. Yes we should have been worken up by our special thing and so should already have data.got_token, but that sort of thinking ends in hung boxes and me having to try to mitigate thousands of boxes suddenly hitting a case we didn't think was possible. Thanks, Josef