On 02/24, Linus Torvalds wrote:
>
> However, I see at least one case where this exclusive wakeup seems broken:
>
>                 /*
>                  * But because we didn't read anything, at this point we can
>                  * just return directly with -ERESTARTSYS if we're interrupted,
>                  * since we've done any required wakeups and there's no need
>                  * to mark anything accessed. And we've dropped the lock.
>                  */
>                 if (wait_event_interruptible_exclusive(pipe->rd_wait,
>                                 pipe_readable(pipe)) < 0)
>                         return -ERESTARTSYS;
>
> and I'm wondering if the issue is that the *readers* got stuck,
> Because that "return -ERESTARTSYS" path now basically will by-pass the
> logic to wake up the next exclusive waiter.

I think this is fine... let's denote this reader as R.

> Because that "return -ERESTARTSYS" is *after* the reader has been on
> the rd_wait queue - and possibly gotten the only wakeup that any of
> the readers will ever get - and now it returns without waking up any
> other reader.

I think this can't happen. ___wait_event() does

        init_wait_entry(&__wq_entry, exclusive ? WQ_FLAG_EXCLUSIVE : 0);        \
        for (;;) {                                                              \
                long __int = prepare_to_wait_event(&wq_head, &__wq_entry, state);\
                                                                                \
                if (condition)                                                  \
                        break;                                                  \
                                                                                \
                if (___wait_is_interruptible(state) && __int) {                 \
                        __ret = __int;                                          \
                        goto __out;                                             \
                }                                                               \
                                                                                \
                cmd;                                                            \
        }                                                                       \

and in this case condition == pipe_readable(pipe), cmd == schedule().

Suppose that R got that only wakeup, and wake_up() races with some signal
so that signal_pending(R) is true.

In this case prepare_to_wait_event() will return -ERESTARTSYS, but
___wait_event() won't return this error code: the condition is tested
before __int, so it will check pipe_readable() first and return 0.

After that R will restart the main loop with wake_next_reader = true,
and whatever it does it should do wake_up(pipe->rd_wait) before returning.

Note also that in this case prepare_to_wait_event() removes the waiter
from the wait_queue_head->head list, so another wake_up() can't pick this
task.

Can ___wait_event() miss the pipe_readable() event in this case?
No, both wake_up() and prepare_to_wait_event() take the same wq_head->lock.

What if pipe_readable() is actually false? Say, a spurious wakeup or, say,
pipe_write() does wake_up(rd_wait) when another reader has already made the
pipe_readable() condition false? This case looks "obviously fine" too.

So I am still confused. I will wait for a reply from Sapkal, then I'll try
to make a debugging patch.

Oleg.