On 10/14, Linus Torvalds wrote:
>
> On Wed, 14 Oct 2009, Oleg Nesterov wrote:
> >
> > >  void tty_flush_to_ldisc(struct tty_struct *tty)
> > >  {
> > > -	flush_to_ldisc(&tty->buf.work.work);
> > > +	flush_delayed_work(&tty->buf.work);
> > >  }
> >
> > Can't comment on this change because I don't understand the problem.
>
> The work function is "flush_to_ldisc()", and what we want to make sure of
> is that the work has been called.

Thanks... But this contradicts the exchange below:

> > As for tty_flush_to_ldisc(), what if tty->buf.work.work was not scheduled?
> > In this case flush_delayed_work() does nothing. Is it OK?
>
> Yes. In fact, it would be a bonus over our current "we always call that
> flush function whether it was scheduled or not" code.

But I guess I understand what you meant.

> If the del_timer() fails, the timer might still be running on another CPU
> right at that moment, but not quite have queued the work yet. And then
> we'd potentially get the wrong 'cwq' in flush_work() (we'd use the 'saved'
> work), and not wait for it.

Or we can get the right cwq, but since the work is not queued and it is not
cwq->current_work, flush_work() correctly assumes there is nothing to do.

> I wonder if we could mark the case of "workqueue is on timer" by setting
> the "work->entry" list to some special value. That way
>
>	list_empty(&work->entry)
>
> would always mean "it's neither pending _nor_ scheduled", and
> flush_delayed_work() could have a fast-case check for that at the top:
>
>	if (list_empty(&work->entry))
>		return;

Yes, but we already have this: delayed_work_pending(). If it is false,
the work is neither pending nor scheduled. But it may still be running,
which we can detect by checking cwq->current_work. The problem is:
should we check all CPUs to detect the running case? Please see below.

> > And just in case... Of course, if dwork was pending and running on another CPU,
> > then flush_delayed_work(dwork) can return before the running callback terminates.
> > But I guess this is what we want.
>
> No, we want to wait for the callback to terminate, so we do want to hit
> that 'flush_work()' case.

Hmm. Now I am confused. OK, let's suppose dwork's callback is running
on CPU 0, and a thread running on CPU 1 does queue_delayed_work(dwork, delay).

Now, flush_delayed_work() will flush the work queued by the 2nd
queue_delayed_work() correctly, but it can return before the callback
running on CPU 0 completes. If this is not what we want, then we have
to iterate over all CPUs.

As for the optimization, I think you are right: flush_delayed_work()
can do

	if (delayed_work_pending(dwork) && del_timer_sync(&dwork->timer)) {
		...
	}
	flush_work(&dwork->work);

assuming that we should not check all CPUs.

And in this case perhaps we can even do something like

	if (!delayed_work_pending(dwork) &&
	    get_wq_data(&dwork->work)->current_work != &dwork->work)
		return;

but this needs barriers, and run_workqueue() needs smp_mb__before_clear_bit().

I'll try to think more tomorrow, but I doubt it is possible to avoid
the del_timer_sync() logic. Whatever we do, if we hit a queueing in
progress we should spin until it is finished.

Oleg.
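[Editor's note: a minimal sketch of the "del_timer_sync() + flush_work()"
idea discussed above, for illustration only. It assumes the function lives
in kernel/workqueue.c, where the internal helpers __queue_work() and
wq_per_cpu() and the static keventd_wq are visible (all present in 2.6.31),
and that the dwork was scheduled on the global queue. Like flush_work(),
it does not scan other CPUs for an already-running callback; this is a
sketch of the approach, not the final patch.]

	/*
	 * Sketch only: flush_delayed_work() built from the pieces
	 * discussed in this thread.  Assumes kernel/workqueue.c context.
	 */
	void flush_delayed_work(struct delayed_work *dwork)
	{
		if (delayed_work_pending(dwork) &&
		    del_timer_sync(&dwork->timer)) {
			/*
			 * The timer was still pending: the work has not
			 * been queued yet, so queue it on this CPU now.
			 * Otherwise flush_work() would find nothing to
			 * wait for.
			 */
			struct cpu_workqueue_struct *cwq;

			cwq = wq_per_cpu(keventd_wq, get_cpu());
			__queue_work(cwq, &dwork->work);
			put_cpu();
		}

		/*
		 * Either the work was already queued, or we just queued
		 * it, or it may be cwq->current_work on the cwq recorded
		 * in the work: flush_work() waits in all of those cases,
		 * but it does not check the other CPUs.
		 */
		flush_work(&dwork->work);
	}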