On Wed, 18 Nov 2009, malahal@xxxxxxxxxx wrote:

> Alasdair G Kergon [agk@xxxxxxxxxx] wrote:
> > > After further testing I've hit a lockdep trace. My testing was with
> > > handing over on the same device. I had the snapshot (of an ext3 FS)
> > > mounted and I was doing a sequential direct-io write to a file in the
> > > FS. While writing I triggered a handover with the following:
> >
> > > =======================================================
> > > [ INFO: possible circular locking dependency detected ]
> > > 2.6.32-rc6-snitm #8
> > > -------------------------------------------------------
> > > dmsetup/1827 is trying to acquire lock:
> > > (&md->suspend_lock){+.+...}, at: [<ffffffffa00678d8>] dm_swap_table+0x2d/0x249 [dm_mod]
> > >
> > > but task is already holding lock:
> > > (&journal->j_barrier){+.+...}, at: [<ffffffff8119192d>] journal_lock_updates+0xe1/0xf0
> > >
> > > which lock already depends on the new lock.
> >
> > I'm going to assume this is bogus - and I can't spot any annotations
> > available to suppress it, so people will just have to ignore it.
> >
> > Suspend involves:
> > get suspend lock
> > if dev is not already suspended
> > get journal lock
> > set state "dev is suspended"
> > release suspend lock
> >
> > Resume involves
> > [journal lock is held]
> > get suspend lock
> > if dev is suspended
> > release journal lock
> > set state "dev is not suspended"
> > release suspend lock
> >
> > It looks as if lockdep sees that as a problem:
> > Imagine those two sections running in parallel without the "Is dev
> > suspended?" check, of which lockdep has no knowledge.
>
> Agreed, but is it possible to restructure the suspend code such that it
> acquires the journal lock before the suspend lock, and then releases the
> journal lock if dev is already suspended? This needs some explaining
> in a comment form though! :-)
>
> Thanks, Malahal.
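
For reference, the suspend/resume sequences quoted above can be modelled in
userspace roughly as follows. This is only a sketch under stated assumptions:
it uses pthread mutexes rather than kernel mutexes, every name in it
(suspend_lock, journal_lock, model_suspend, model_resume,
journal_lock_is_held) is made up for illustration, and it assumes that the
"set state" step sits inside the "if". It shows the two acquisition orders
that lockdep records: suspend lock then journal lock on suspend, and journal
lock (still held) then suspend lock on resume.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static pthread_mutex_t suspend_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t journal_lock = PTHREAD_MUTEX_INITIALIZER;
static bool dev_suspended = false;

/* Suspend: take the suspend lock first, then (conditionally) the journal lock. */
static void model_suspend(void)
{
    pthread_mutex_lock(&suspend_lock);
    if (!dev_suspended) {
        pthread_mutex_lock(&journal_lock);   /* order: suspend -> journal */
        dev_suspended = true;
    }
    pthread_mutex_unlock(&suspend_lock);
    /* journal_lock deliberately stays held until model_resume(). */
}

/* Resume: the journal lock is still held when the suspend lock is taken. */
static void model_resume(void)
{
    pthread_mutex_lock(&suspend_lock);       /* order: journal (held) -> suspend */
    if (dev_suspended) {
        pthread_mutex_unlock(&journal_lock);
        dev_suspended = false;
    }
    pthread_mutex_unlock(&suspend_lock);
}

/* Probe whether journal_lock is currently held, using a non-blocking trylock. */
static bool journal_lock_is_held(void)
{
    int rc = pthread_mutex_trylock(&journal_lock);
    if (rc == 0) {
        pthread_mutex_unlock(&journal_lock);
        return false;
    }
    assert(rc == EBUSY);  /* any other error would be unexpected here */
    return true;
}
```

Calling model_suspend() leaves journal_lock held across the return, and a
second model_suspend() is a no-op because of the state check. That state
check is exactly the piece of information lockdep does not have.
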

The real reason for the warning is that ext3's jbd takes the mutex on suspend
(ext3_freeze) and keeps it held until resume (ext3_unfreeze).

You also get a warning if you issue "dmsetup suspend" manually: it warns that
a task exits with a mutex held. When suspending and resuming manually, the
suspend and the resume are done from different processes, so ext3 violates the
mutex specification in Documentation/mutex-design.txt:

 * - only the owner can unlock the mutex
 * - task may not exit with mutex held

This is really a bug in ext3 and ext4; device mapper has nothing to do with
it, except that it triggers it. The bug should be reported to the ext[34]
maintainers and fixed there.

Mikulas

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel