On Sun 03-01-16 10:13:06, Dan Williams wrote:
> On Wed, Dec 23, 2015 at 11:39 AM, Ross Zwisler
> <ross.zwisler@xxxxxxxxxxxxxxx> wrote:
> > To properly handle fsync/msync in an efficient way DAX needs to track dirty
> > pages so it is able to flush them durably to media on demand.
> >
> > The tracking of dirty pages is done via the radix tree in struct
> > address_space. This radix tree is already used by the page writeback
> > infrastructure for tracking dirty pages associated with an open file, and
> > it already has support for exceptional (non struct page*) entries. We
> > build upon these features to add exceptional entries to the radix tree for
> > DAX dirty PMD or PTE pages at fault time.
> >
> > Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
>
> I'm hitting the following report with the ndctl dax test [1] on
> next-20151231. I bisected it to
> commit 3cb108f941de "dax-add-support-for-fsync-sync-v6". I'll take a
> closer look tomorrow, but in case someone can beat me to it, here's
> the back-trace:
>
> ------------[ cut here ]------------
> kernel BUG at fs/inode.c:497!

I suppose this is the check that mapping->nr_exceptional is zero, isn't it?
Hum, I don't see how that could happen given we call
truncate_inode_pages_final() just before the clear_inode() call which
removes all the exceptional entries from the radix tree. And there's not
much room for a race during umount... Does the radix tree really contain
any entry or is it an accounting bug?

                                                                Honza

> [..]
> CPU: 1 PID: 3001 Comm: umount Tainted: G O 4.4.0-rc7+ #2412
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> task: ffff8800da2a5a00 ti: ffff880307794000 task.ti: ffff880307794000
> RIP: 0010:[<ffffffff81280171>] [<ffffffff81280171>] clear_inode+0x71/0x80
> RSP: 0018:ffff880307797d50 EFLAGS: 00010002
> RAX: ffff8800da2a5a00 RBX: ffff8800ca2e7328 RCX: ffff8800da2a5a28
> RDX: 0000000000000001 RSI: 0000000000000005 RDI: ffff8800ca2e7530
> RBP: ffff880307797d60 R08: ffffffff82900ae0 R09: 0000000000000000
> R10: ffff8800ca2e7548 R11: 0000000000000000 R12: ffff8800ca2e7530
> R13: ffff8800ca2e7328 R14: ffff8800da2e88d0 R15: ffff8800da2e88d0
> FS: 00007f2b22f4a880(0000) GS:ffff88031fc40000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00005648abd933e8 CR3: 000000007f3fc000 CR4: 00000000000006e0
> Stack:
>  ffff8800ca2e7328 ffff8800ca2e7000 ffff880307797d88 ffffffffa01c18af
>  ffff8800ca2e7328 ffff8800ca2e74d0 ffffffffa01ec740 ffff880307797db0
>  ffffffff81281038 ffff8800ca2e74c0 ffff880307797e00 ffff8800ca2e7328
> Call Trace:
>  [<ffffffffa01c18af>] xfs_fs_evict_inode+0x5f/0x110 [xfs]
>  [<ffffffff81281038>] evict+0xb8/0x180
>  [<ffffffff8128113b>] dispose_list+0x3b/0x50
>  [<ffffffff81282014>] evict_inodes+0x144/0x170
>  [<ffffffff8126447f>] generic_shutdown_super+0x3f/0xf0
>  [<ffffffff81264837>] kill_block_super+0x27/0x70
>  [<ffffffff81264a53>] deactivate_locked_super+0x43/0x70
>  [<ffffffff81264e9c>] deactivate_super+0x5c/0x60
>  [<ffffffff81285aff>] cleanup_mnt+0x3f/0x90
>  [<ffffffff81285b92>] __cleanup_mnt+0x12/0x20
>  [<ffffffff810c4f26>] task_work_run+0x76/0x90
>  [<ffffffff81003e3a>] syscall_return_slowpath+0x20a/0x280
>  [<ffffffff8192671a>] int_ret_from_sys_call+0x25/0x9f
> Code: 48 8d 93 30 03 00 00 48 39 c2 75 23 48 8b 83 d0 00 00 00 a8 20 74 1a a8 40 75 18 48 c7 83 d0 00 00 00 60 00 00 00 5b 41 5c 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 1f 44 00 00 0f 1f 44 00 00 55
> RIP [<ffffffff81280171>] clear_inode+0x71/0x80
>  RSP <ffff880307797d50>
> ---[ end trace 3b1d8898a94a4fc1 ]---
>
> [1]: git://git@xxxxxxxxxx:pmem/ndctl.git pending
> make TESTS="test/dax.sh" check
>
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
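
To make the question above concrete (does the tree really still hold an entry, or is only the counter wrong?), here is a minimal userspace sketch. It is not kernel code: every toy_* name is hypothetical, a flat array stands in for the radix tree, the tag bit is not the kernel's encoding, and the counter is simply named nr_exceptional as in the message. It only illustrates how walking the entries separates the two cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 8
/* Hypothetical tag bit marking a non struct page entry (not the kernel's encoding). */
#define TOY_EXCEPTIONAL_BIT ((uintptr_t)1)

/* Hypothetical stand-in for struct address_space. */
struct toy_mapping {
	uintptr_t slots[TOY_SLOTS];   /* flat array standing in for the radix tree */
	unsigned long nr_exceptional; /* cached count, named as in the message */
};

enum toy_verdict {
	TOY_CLEAN,            /* no entries left, counter is zero */
	TOY_ACCOUNTING_BUG,   /* no entries left, counter is nonzero */
	TOY_LEFTOVER_ENTRIES, /* the "tree" really still holds entries */
};

static int toy_is_exceptional(uintptr_t entry)
{
	return (entry & TOY_EXCEPTIONAL_BIT) != 0;
}

/* Store a tagged entry; bump the count only if the slot was not already tagged. */
static void toy_insert_exceptional(struct toy_mapping *m, size_t index,
				   uintptr_t value)
{
	assert(index < TOY_SLOTS);
	if (!toy_is_exceptional(m->slots[index]))
		m->nr_exceptional++;
	m->slots[index] = (value << 1) | TOY_EXCEPTIONAL_BIT;
}

/* Model of a final truncate: empty every slot and keep the count in sync. */
static void toy_truncate_final(struct toy_mapping *m)
{
	for (size_t i = 0; i < TOY_SLOTS; i++) {
		if (toy_is_exceptional(m->slots[i]))
			m->nr_exceptional--;
		m->slots[i] = 0;
	}
}

/* Walk the slots instead of trusting the cached count. */
static unsigned long toy_count_exceptional(const struct toy_mapping *m)
{
	unsigned long n = 0;

	for (size_t i = 0; i < TOY_SLOTS; i++)
		if (toy_is_exceptional(m->slots[i]))
			n++;
	return n;
}

/* Classify the state a clear_inode()-style check would be looking at. */
static enum toy_verdict toy_check_before_clear(const struct toy_mapping *m)
{
	if (toy_count_exceptional(m) != 0)
		return TOY_LEFTOVER_ENTRIES;
	if (m->nr_exceptional != 0)
		return TOY_ACCOUNTING_BUG;
	return TOY_CLEAN;
}
```

In this model, a mapping with entries inserted and no truncate reports TOY_LEFTOVER_ENTRIES, the same mapping after toy_truncate_final() reports TOY_CLEAN, and an empty mapping whose counter was left nonzero reports TOY_ACCOUNTING_BUG. In the real kernel the equivalent experiment would presumably be to walk the mapping's radix tree at the point of the BUG and compare what is found against the counter.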