On 10/14, Linus Torvalds wrote:
>
> Yeah. Basically, we want to make sure that it has been called *since it
> was scheduled*. In case it has already been called and is no longer
> pending at all, not calling it again is fine.
>
> It's just that we didn't have any way to do that "force the pending
> delayed work to be scheduled", so instead we ran the scheduled function by
> hand synchronously. Which then seems to have triggered other problems.

Yes. But I am not sure these problems are new. I don't understand this
code well, but from a quick grep it looks to me like flush_to_ldisc()
could already race with itself even without tty_flush_to_ldisc(), which
calls work->func() by hand.

> So I don't think it really matters in practice, but I do think that we
> have that nasty hole in workqueues in general with overlapping work. I
> wish I could think of a way to fix it.

I don't entirely agree this is a hole; everything works "as expected".
But yes, I agree, users often do not realize that multithreaded
workqueues imply overlapping works (unless the caller takes care), and
in that case work->func() has to handle the races itself.

Perhaps it makes sense to introduce something like

	/* same as queue_work(), but ensures work->func() can't race with itself */
	int queue_work_xxx(struct workqueue_struct *wq, struct work_struct *work)
	{
		int ret = 0;

		if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) {
			struct cpu_workqueue_struct *cwq = get_wq_data(work);
			int cpu = get_cpu();

			/*
			 * The "cwq->current_work != work" check is not strictly
			 * needed, but we don't want to pin this work to a
			 * single CPU.
			 */
			if (!cwq || cwq->current_work != work)
				cwq = wq_per_cpu(wq, cpu);

			__queue_work(cwq, work);
			put_cpu();
			ret = 1;
		}

		return ret;
	}

This way we can never have multiple instances of the same work running
on different CPUs: if the previous instance is still running, the new
one is queued behind it on the same CPU and thus serialized. Assuming,
of course, the caller never mixes queue_work_xxx() with queue_work().
The logic for queue_delayed_work_xxx() is similar.

But this can race with cpu_down(). I think this is solvable but needs
more locking. I mean, the caller of queue_work_xxx() must not use the
old get_wq_data(work) if that CPU is already dead, and a simple
cpu_online() check is not enough: we can race with
workqueue_cpu_callback(CPU_POST_DEAD) flushing this cwq, in which case
we should carefully insert this work into the almost-dead queue.

Or, perhaps better, instead of a new helper we can probably use the
free bit in work_struct->data to mark this work/dwork as a
"single-instance work". In this case __queue_work() and
queue_delayed_work_on() should check this bit.

Do you think this makes sense and can close the hole? If yes, I'll try
to do this over the weekend.

Oleg.
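
P.S. To make the "overlapping works" point concrete: with a
multithreaded workqueue nothing stops two CPUs from running the same
work->func() at the same time today, so a non-reentrant handler has to
serialize itself. A minimal illustration of what I mean by "work->func()
has to handle the races itself" (my_ctx and my_work_fn are made-up
names, purely for illustration):

	#include <linux/mutex.h>
	#include <linux/workqueue.h>

	struct my_ctx {
		struct mutex	lock;
		struct work_struct work;
	};

	static void my_work_fn(struct work_struct *work)
	{
		struct my_ctx *ctx = container_of(work, struct my_ctx, work);

		/*
		 * A second instance of this handler can run concurrently
		 * on another CPU, so take our own lock to serialize.
		 */
		mutex_lock(&ctx->lock);
		/* ... the actual, non-reentrant processing ... */
		mutex_unlock(&ctx->lock);
	}

With queue_work_xxx() above, the mutex would become unnecessary.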
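
P.P.S. And a very rough, untested sketch of the "free bit" variant, to
show what I have in mind. WORK_STRUCT_SINGLE is a made-up name for the
currently unused flag bit (WORK_STRUCT_FLAG_MASK reserves two bits but
only WORK_STRUCT_PENDING is used, and get_wq_data() masks both bits off,
so the extra bit does not corrupt the saved cwq pointer). This would
live inside kernel/workqueue.c, which already has cpu_workqueue_struct,
get_wq_data(), wq_per_cpu() and __queue_work():

	#define WORK_STRUCT_SINGLE	1	/* made-up: the unused flag bit */

	/* the caller sets this once, before the first queue_work() */
	static inline void work_set_single(struct work_struct *work)
	{
		set_bit(WORK_STRUCT_SINGLE, work_data_bits(work));
	}

	/* then queue_work() could do something like: */
	int queue_work(struct workqueue_struct *wq, struct work_struct *work)
	{
		int ret = 0;

		if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) {
			struct cpu_workqueue_struct *cwq = NULL;
			int cpu = get_cpu();

			if (test_bit(WORK_STRUCT_SINGLE, work_data_bits(work))) {
				struct cpu_workqueue_struct *old = get_wq_data(work);

				/* previous instance still running? queue behind it */
				if (old && old->current_work == work)
					cwq = old;
			}
			if (!cwq)
				cwq = wq_per_cpu(wq, cpu);

			__queue_work(cwq, work);
			put_cpu();
			ret = 1;
		}
		return ret;
	}

Modulo the cpu_down() race discussed above, of course.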