On Mon 28-02-22 18:28:26, Byungchul Park wrote: > On Thu, Feb 24, 2022 at 11:22:39AM +0100, Jan Kara wrote: > > On Thu 24-02-22 10:11:02, Byungchul Park wrote: > > > On Wed, Feb 23, 2022 at 03:48:59PM +0100, Jan Kara wrote: > > > > > KJOURNALD2(kthread) TASK1(ksys_write) TASK2(ksys_write) > > > > > > > > > > wait A > > > > > --- stuck > > > > > wait B > > > > > --- stuck > > > > > wait C > > > > > --- stuck > > > > > > > > > > wake up B wake up C wake up A > > > > > > > > > > where: > > > > > A is a wait_queue, j_wait_commit > > > > > B is a wait_queue, j_wait_transaction_locked > > > > > C is a rwsem, mapping.invalidate_lock > > > > > > > > I see. But a situation like this is not necessarily a guarantee of a > > > > deadlock, is it? I mean there can be task D that will eventually call say > > > > 'wake up B' and unblock everything and this is how things were designed to > > > > work? Multiple sources of wakeups are quite common I'd say... What does > > > > > > Yes. At the very beginning when I desgined Dept, I was thinking whether > > > to support multiple wakeup sources or not for a quite long time. > > > Supporting it would be a better option to aovid non-critical reports. > > > However, I thought anyway we'd better fix it - not urgent tho - if > > > there's any single circle dependency. That's why I decided not to > > > support it for now and wanted to gather the kernel guys' opinions. Thing > > > is which policy we should go with. > > > > I see. So supporting only a single wakeup source is fine for locks I guess. > > But for general wait queues or other synchronization mechanisms, I'm afraid > > it will lead to quite some false positive reports. Just my 2c. > > Thank you for your feedback. > > I realized we've been using "false positive" differently. There exist > the three types of code in terms of dependency and deadlock. It's worth > noting that dependencies are built from between waits and events in Dept. > > --- > > case 1. Code with an actual circular dependency, but not deadlock. > > A circular dependency can be broken by a rescue wakeup source e.g. > timeout. It's not a deadlock. If it's okay that the contexts > participating in the circular dependency and others waiting for the > events in the circle are stuck until it gets broken. Otherwise, say, > if it's not meant, then it's anyway problematic. > > 1-1. What if we judge this code is problematic? > 1-2. What if we judge this code is good? > > case 2. Code with an actual circular dependency, and deadlock. > > There's no other wakeup source than those within the circular > dependency. Literally deadlock. It's problematic and critical. > > 2-1. What if we judge this code is problematic? > 2-2. What if we judge this code is good? > > case 3. Code with no actual circular dependency, and not deadlock. > > Must be good. > > 3-1. What if we judge this code is problematic? > 3-2. What if we judge this code is good? > > --- > > I call only 3-1 "false positive" circular dependency. And you call 1-1 > and 3-1 "false positive" deadlock. > > I've been wondering if the kernel guys esp. Linus considers code with > any circular dependency is problematic or not, even if it won't lead to > a deadlock, say, case 1. Even though I designed Dept based on what I > believe is right, of course, I'm willing to change the design according > to the majority opinion. > > However, I would never allow case 1 if I were the owner of the kernel > for better stability, even though the code works anyway okay for now. So yes, I call a report for the situation "There is circular dependency but deadlock is not possible." a false positive. And that is because in my opinion your definition of circular dependency includes schemes that are useful and used in the kernel. Your example in case 1 is kind of borderline (I personally would consider that bug as well) but there are other more valid schemes with multiple wakeup sources like: We have a queue of work to do Q protected by lock L. Consumer process has code like: while (1) { lock L prepare_to_wait(work_queued); if (no work) { unlock L sleep } else { unlock L do work wake_up(work_done) } } AFAIU Dept will create dependency here that 'wakeup work_done' is after 'wait for work_queued'. Producer has code like: while (1) { lock L prepare_to_wait(work_done) if (too much work queued) { unlock L sleep } else { queue work unlock L wake_up(work_queued) } } And Dept will create dependency here that 'wakeup work_queued' is after 'wait for work_done'. And thus we have a trivial cycle in the dependencies despite the code being perfectly valid and safe. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR