Oleg, Thanks for your comments. Shows how little I really understand the workqueue API :) On Tue, 2007-07-03 at 21:31 +0400, Oleg Nesterov wrote: > > I think this could lead to false positives, but then I think we > > shouldn't care about those. Let me explain. The thing is that with this > > it can happen that the code using the workqueue somehow obeys an > > ordering in the work it queues, so as far as I can tell it'd be fine to > > have two work items A and B where only B takes a lock L1, and then have > > a wait_on_work(A) under L1, if and only if B was always queued right > > after A (or something like that). > > Not sure I understand. Yes, we can have false positives, but I think the > ordering in the workqueue doesn't matter. > > If A does NOT take a lock L1, then it is OK to do cancel_work_sync(A) > under L1, regardless of which other work_structs this workqueue has, > before or after A. Ah, cancel_work_sync() waits only for it if A is currently running? > Now we have a false positive if some time we queue B into that workqueue, > and this is not good. Right. I was thinking of the flush_workqueue case where any of A or B matters. > We can avoid this problem if we put lockdep_map into work_struct, so > that wait_on_work() "locks" work->lockdep_map, while flush_workqueue() > takes wq->lockdep_map. Yeah, and then we'll take both wq->lockdep_map and the work_struct->lockdep_map when running that work. That should work, I'll give it a go later. > But probably we don't need this right now, at least until we really > have a lot of false positives while converting from flush_workqueue() > to cancel_work_sync/cancel_delayed_work_sync. I didn't really think about those yet, but I think you're right. > > @@ -257,7 +260,9 @@ static void run_workqueue(struct cpu_wor > > > > BUG_ON(get_wq_data(work) != cwq); > > work_clear_pending(work); > > + lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_); > > f(work); > > + lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_); > ^^^ > Isn't it better to call lock_release() with nested == 1 ? Not sure, Ingo? > > +#define create_workqueue(name) \ > > +({ \ > > + static struct lock_class_key __key; \ > > + struct workqueue_struct *__wq; \ > > + \ > > + __wq = __create_workqueue((name), 0, 0, &__key); \ > > + __wq; \ > > +}) > > Why do we need __wq ? No particular reason I think, I copied some other code doing it that way. > +#define create_workqueue(name) \ > ({ \ > static struct lock_class_key __key; \ > __create_workqueue((name), 0, 0, &__key); \ > }) > > Actually, I'd suggest to rename __create_workqueue() to __create_workqueue_key(), > and then you need the only change in linux/workqueue.h > > - extern struct workqueue_struct *__create_workqueue(...); > + extern struct workqueue_struct *__create_workqueue_key(..., key); > + #define __create_workqueue(...) \ > + static struct lock_class_key __key; \ > + __create_workqueue_key(..., key); \ > > but this is a matter of taste. Sure, that works. I thought about compiling out the argument completely for the no-lockdep case, would you prefer that? > Btw, I think your patch found a real bug in net/mac80211/, cool! Actually, I found the bug by experiencing it and analysing the stack traces; then I thought that it should be possible to add lockdep support for that to avoid regressing :) johannes
Attachment:
signature.asc
Description: This is a digitally signed message part