On Fri, Dec 16, 2022 at 08:31:54PM -0600, Linus Torvalds wrote:
> Ok, let's bring in Waiman for the rwlock side.
>
> On Fri, Dec 16, 2022 at 5:54 PM Boqun Feng <boqun.feng@xxxxxxxxx> wrote:
> >
> > Right, for a reader not in_interrupt(), it may be blocked by a random
> > waiting writer because of the fairness, even if the lock is currently
> > held by a reader:
> >
> >     CPU 1                   CPU 2                   CPU 3
> >     read_lock(&tasklist_lock); // get the lock
> >
> >                             write_lock_irq(&tasklist_lock); // wait for the lock
> >
> >                                                     read_lock(&tasklist_lock); // cannot get the lock because of the fairness
>
> But this should be ok - because CPU1 can make progress and eventually
> release the lock.
>
> So the tasklist_lock use is fine on its own - the reason interrupts
> are special is because an interrupt on CPU 1 taking the lock for
> reading would deadlock otherwise. As long as it happens on another
> CPU, the original CPU should then be able to make progress.
>
> But the problem here seems to be that *another* lock is also involved
> (in this case apparently "host->lock"), and now if CPU1 and CPU2 get
> these two locks in a different order, you can get an ABBA deadlock.
>
> And apparently our lockdep machinery doesn't catch that issue, so it
> doesn't get flagged.

Lockdep has actually caught that; the locks involved are mentioned in the
report (https://marc.info/?l=linux-ide&m=167094379710177&w=2).  The form
of the report might have been better, but if anything, it doesn't mention
the potential involvement of a tasklist_lock writer, which is what turns
that into a deadlock.  OTOH, that's more or less implicit for the entire
class:

        read_lock(A) [non-interrupt]
                                                local_irq_disable()
        local_irq_disable()                     spin_lock(B)
                                                write_lock(A)
        read_lock(A) [in interrupt]
        spin_lock(B)

is what that sort of report is about.  In this case A is tasklist_lock,
B is host->lock.  Possible call chains for CPU1 and CPU2 are reported...

I wonder why analogues of that haven't been reported for other SCSI
hosts - it's a really common pattern there...

> I'm not sure what the lockdep rules for rwlocks are, but maybe lockdep
> treats rwlocks as being _always_ unfair, not knowing about that "it's
> only unfair when it's in interrupt context".
>
> Maybe we need to always make rwlock unfair?

Possibly only for tasklist_lock?  ISTR threads about the possibility of
explicit read_lock_unfair()...
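For reference, the fair-vs-unfair behaviour being discussed lives in the
qrwlock read slowpath; roughly the following (trimmed from
kernel/locking/qrwlock.c and quoted from memory, so details may not match
any given tree exactly) - an in_interrupt() reader only waits for a
current write holder to go away, while everybody else drops its bias and
queues behind any waiting writer:

void queued_read_lock_slowpath(struct qrwlock *lock)
{
        /* Readers come here when they cannot get the lock without waiting */
        if (unlikely(in_interrupt())) {
                /*
                 * Readers in interrupt context ignore queued writers and
                 * only wait for a current write holder to release the lock.
                 */
                atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
                return;
        }
        atomic_sub(_QR_BIAS, &lock->cnts);

        /* Put the reader into the wait queue, behind any waiting writer */
        arch_spin_lock(&lock->wait_lock);
        atomic_add(_QR_BIAS, &lock->cnts);

        /* Wait until no writer holds the lock */
        atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));

        /* Signal the next one in queue to become queue head */
        arch_spin_unlock(&lock->wait_lock);
}

That is why the non-interrupt reader (CPU 3 in Boqun's diagram) blocks
behind the waiting writer, while the same read_lock() taken in an
interrupt would have been granted immediately.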
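As for an explicit read_lock_unfair(), nothing of that sort exists today;
a purely hypothetical sketch on top of the qrwlock internals (the name and
the implementation below are illustrative only) could simply apply the
in_interrupt() policy unconditionally:

/*
 * Hypothetical: acquire a qrwlock for read without queueing behind
 * waiting writers, i.e. the policy the slowpath already applies to
 * in_interrupt() readers.  Illustrative sketch, not an existing API.
 */
static inline void read_lock_unfair(struct qrwlock *lock)
{
        int cnts;

        /* announce ourselves as a reader */
        cnts = atomic_add_return_acquire(_QR_BIAS, &lock->cnts);
        if (likely(!(cnts & _QW_LOCKED)))
                return; /* no write holder; waiting writers are ignored */

        /* wait only for the current write holder to drop the lock */
        atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
}

The obvious cost is that such readers can starve writers, which is the
reason the queued rwlock went (mostly) fair in the first place.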