On 07/11, Oleg Nesterov wrote: > > Jens, > > I managed to convince myself I understand why 2/2 needs this change... > But rq_qos_wait() still looks suspicious to me. Why can't the main loop > "break" right after io_schedule()? rq_qos_wake_function() either sets > data->got_token = true or it doesn't wakeup the waiter sleeping in > io_schedule() > > This means that data.got_token = F at the 2nd iteration is only possible > after a spurious wakeup, right? But in this case we need to set state = > TASK_UNINTERRUPTIBLE again to avoid busy-wait looping ? Oh. I can be easily wrong, I never read this code before, but it seems to me there is another unrelated race. rq_qos_wait() can't rely on finish_wait() because it doesn't necessarily take wq_head->lock. rq_qos_wait() inside the main loop does if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) { finish_wait(&rqw->wait, &data.wq); /* * We raced with wbt_wake_function() getting a token, * which means we now have two. Put our local token * and wake anyone else potentially waiting for one. */ if (data.got_token) cleanup_cb(rqw, private_data); break; } finish_wait() + "if (data.got_token)" can race with rq_qos_wake_function() which does data->got_token = true; list_del_init(&curr->entry); rq_qos_wait() can see these changes out-of-order: finish_wait() can see list_empty_careful() == T and avoid wq_head->lock, and in this case the code above can see data->got_token = false. No? and I don't really understand has_sleeper = false; at the end of the main loop. I think it should do "has_sleeper = true", we need to execute the code above only once, right after prepare_to_wait(). But this is harmless. Oleg.