On Wed, Feb 23, 2022 at 03:48:59PM +0100, Jan Kara wrote: > On Wed 23-02-22 09:35:34, Byungchul Park wrote: > > On Mon, Feb 21, 2022 at 08:02:04PM +0100, Jan Kara wrote: > > > On Thu 17-02-22 20:10:04, Byungchul Park wrote: > > > > [ 9.008161] =================================================== > > > > [ 9.008163] DEPT: Circular dependency has been detected. > > > > [ 9.008164] 5.17.0-rc1-00015-gb94f67143867-dirty #2 Tainted: G W > > > > [ 9.008166] --------------------------------------------------- > > > > [ 9.008167] summary > > > > [ 9.008167] --------------------------------------------------- > > > > [ 9.008168] *** DEADLOCK *** > > > > [ 9.008168] > > > > [ 9.008168] context A > > > > [ 9.008169] [S] (unknown)(&(&journal->j_wait_transaction_locked)->dmap:0) > > > > [ 9.008171] [W] wait(&(&journal->j_wait_commit)->dmap:0) > > > > [ 9.008172] [E] event(&(&journal->j_wait_transaction_locked)->dmap:0) > > > > [ 9.008173] > > > > [ 9.008173] context B > > > > [ 9.008174] [S] down_write(mapping.invalidate_lock:0) > > > > [ 9.008175] [W] wait(&(&journal->j_wait_transaction_locked)->dmap:0) > > > > [ 9.008176] [E] up_write(mapping.invalidate_lock:0) > > > > [ 9.008177] > > > > [ 9.008178] context C > > > > [ 9.008179] [S] (unknown)(&(&journal->j_wait_commit)->dmap:0) > > > > [ 9.008180] [W] down_write(mapping.invalidate_lock:0) > > > > [ 9.008181] [E] event(&(&journal->j_wait_commit)->dmap:0) > > > > [ 9.008181] > > > > [ 9.008182] [S]: start of the event context > > > > [ 9.008183] [W]: the wait blocked > > > > [ 9.008183] [E]: the event not reachable > > > > > > So what situation is your tool complaining about here? Can you perhaps show > > > it here in more common visualization like: > > > > Sure. > > > > > TASK1 TASK2 > > > does foo, grabs Z > > > does X, grabs lock Y > > > blocks on Z > > > blocks on Y > > > > > > or something like that? Because I was not able to decipher this from the > > > report even after trying for some time... > > > > KJOURNALD2(kthread) TASK1(ksys_write) TASK2(ksys_write) > > > > wait A > > --- stuck > > wait B > > --- stuck > > wait C > > --- stuck > > > > wake up B wake up C wake up A > > > > where: > > A is a wait_queue, j_wait_commit > > B is a wait_queue, j_wait_transaction_locked > > C is a rwsem, mapping.invalidate_lock > > I see. But a situation like this is not necessarily a guarantee of a > deadlock, is it? I mean there can be task D that will eventually call say > 'wake up B' and unblock everything and this is how things were designed to > work? Multiple sources of wakeups are quite common I'd say... What does Yes. At the very beginning when I desgined Dept, I was thinking whether to support multiple wakeup sources or not for a quite long time. Supporting it would be a better option to aovid non-critical reports. However, I thought anyway we'd better fix it - not urgent tho - if there's any single circle dependency. That's why I decided not to support it for now and wanted to gather the kernel guys' opinions. Thing is which policy we should go with. > Dept do to prevent false reports in cases like this? > > > The above is the simplest form. And it's worth noting that Dept focuses > > on wait and event itself rather than grabing and releasing things like > > lock. The following is the more descriptive form of it. > > > > KJOURNALD2(kthread) TASK1(ksys_write) TASK2(ksys_write) > > > > wait @j_wait_commit > > ext4_truncate_failed_write() > > down_write(mapping.invalidate_lock) > > > > ext4_truncate() > > ... > > wait @j_wait_transaction_locked > > > > ext_truncate_failed_write() > > down_write(mapping.invalidate_lock) > > > > ext4_should_retry_alloc() > > ... > > __jbd2_log_start_commit() > > wake_up(j_wait_commit) > > jbd2_journal_commit_transaction() > > wake_up(j_wait_transaction_locked) > > up_write(mapping.invalidate_lock) > > > > I hope this would help you understand the report. > > I see, thanks for explanation! So the above scenario is impossible because My pleasure. > for anyone to block on @j_wait_transaction_locked the transaction must be > committing, which is done only by kjournald2 kthread and so that thread > cannot be waiting at @j_wait_commit. Essentially blocking on > @j_wait_transaction_locked means @j_wait_commit wakeup was already done. kjournal2 repeatedly does the wait and the wake_up so the above scenario looks possible to me even based on what you explained. Maybe I should understand how the journal things work more for furhter discussion. Your explanation is so helpful. Thank you really. Thanks, Byungchul > I guess this shows there can be non-trivial dependencies between wait > queues which are difficult to track in an automated way and without such > tracking we are going to see false positives... > > Honza > > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR