On 07/02, Johannes Berg wrote: > > On Sat, 2007-06-30 at 15:46 +0400, Oleg Nesterov wrote: > > > Johannes, could you change wait_on_work() as well? Most users of > > flush_workqueue() should be converted to use it. (to avoid a possible confusion: I meant, most users of flush_workqueue() should be converted to use cancel_work_sync/cancel_delayed_work_sync which in turn use wait_on_work()). > I think this could lead to false positives, but then I think we > shouldn't care about those. Let me explain. The thing is that with this > it can happen that the code using the workqueue somehow obeys an > ordering in the work it queues, so as far as I can tell it'd be fine to > have two work items A and B where only B takes a lock L1, and then have > a wait_on_work(A) under L1, if and only if B was always queued right > after A (or something like that). Not sure I understand. Yes, we can have false positives, but I think the ordering in the workqueue doesn't matter. If A does NOT take a lock L1, then it is OK to do cancel_work_sync(A) under L1, regardless of which other work_structs this workqueue has, before or after A. Now we have a false positive if some time we queue B into that workqueue, and this is not good. We can avoid this problem if we put lockdep_map into work_struct, so that wait_on_work() "locks" work->lockdep_map, while flush_workqueue() takes wq->lockdep_map. But probably we don't need this right now, at least until we really have a lot of false positives while converting from flush_workqueue() to cancel_work_sync/cancel_delayed_work_sync. > However, since this invariant is > rather likely to be broken by subsequent changes to the code, I think > such code should probably use two workqueues instead, and I doubt it > ever happens anyway since then people could in most cases just put both > works into one function. Well, I am not sure, think about the shared workqueues like keventd_wq... > Hence I included it in my patch below. a couple of minor nits below, > @@ -257,7 +260,9 @@ static void run_workqueue(struct cpu_wor > > BUG_ON(get_wq_data(work) != cwq); > work_clear_pending(work); > + lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_); > f(work); > + lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_); ^^^ Isn't it better to call lock_release() with nested == 1 ? > +#define create_workqueue(name) \ > +({ \ > + static struct lock_class_key __key; \ > + struct workqueue_struct *__wq; \ > + \ > + __wq = __create_workqueue((name), 0, 0, &__key); \ > + __wq; \ > +}) Why do we need __wq ? +#define create_workqueue(name) \ ({ \ static struct lock_class_key __key; \ __create_workqueue((name), 0, 0, &__key); \ }) Actually, I'd suggest to rename __create_workqueue() to __create_workqueue_key(), and then you need the only change in linux/workqueue.h - extern struct workqueue_struct *__create_workqueue(...); + extern struct workqueue_struct *__create_workqueue_key(..., key); + #define __create_workqueue(...) \ + static struct lock_class_key __key; \ + __create_workqueue_key(..., key); \ but this is a matter of taste. Btw, I think your patch found a real bug in net/mac80211/, cool! Oleg. - To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html