On Tue, Jun 12, 2018 at 12:46:19PM -0700, Matthew Wilcox wrote:
> On Tue, Jun 12, 2018 at 01:37:41PM -0600, Ross Zwisler wrote:
> > On Tue, Jun 12, 2018 at 04:31:22AM -0700, Matthew Wilcox wrote:
> > > On Tue, Jun 12, 2018 at 12:40:41PM +0200, David Sterba wrote:
> > > > [ 9875.174796] kernel BUG at fs/inode.c:513!
> > >
> > > What the ...
> > >
> > > Somehow the fix for that got dropped. I spent most of last week chasing
> > > that problem! This is the correct code:
> > >
> > > http://git.infradead.org/users/willy/linux-dax.git/commitdiff/01177bb06761539af8a6c872416109e2c8b64559
> > >
> > > I'll check over the patchset and see if anything else got dropped!
> >
> > Can you please repost when you have this sorted?
> >
> > I think the commit you've pointed to is in your xarray-20180601 branch, but I
> > see two more recent xarray branches in your tree (xarray-20180608 and
> > xarray-20180612).
> >
> > Basically, I don't know what is stable and what's not, and what I should be
> > reviewing/testing.
>
> Yup, I shall. The xarray-20180612 is the most recent thing I've
> published, but I'm still going over the 0601 patchset looking for other
> little pieces I may have dropped. I've found a couple, and I'm updating
> the 0612 branch each time I find another one.
>
> If you want to start looking at the DAX patches on the 0612 branch,
> that wouldn't be a waste of your time. Neither would testing; I don't
> think I dropped anything from the DAX patches.

I tested xarray-20180612 vs next-20180612, and your patches cause a new
deadlock with XFS + DAX + generic/269. Here's the output from
"echo w > /proc/sysrq-trigger":

[ 302.520590] sysrq: SysRq : Show Blocked State
[ 302.521431] task PC stack pid father
[ 302.522419] fsstress D 0 1703 1660 0x00000004
[ 302.523238] Call Trace:
[ 302.523634] __schedule+0x2c5/0xad0
[ 302.524116] schedule+0x36/0x90
[ 302.524572] get_unlocked_entry+0xce/0x120
[ 302.525160] ? dax_insert_entry+0x2a0/0x2a0
[ 302.525859] grab_mapping_entry+0x1c4/0x240
[ 302.526515] dax_iomap_pte_fault+0x115/0x1140
[ 302.527181] dax_iomap_fault+0x37/0x40
[ 302.527697] __xfs_filemap_fault+0x2de/0x310
[ 302.528241] xfs_filemap_fault+0x2c/0x30
[ 302.528828] __do_fault+0x26/0x160
[ 302.529280] __handle_mm_fault+0xc96/0x1320
[ 302.529933] handle_mm_fault+0x1ba/0x3c0
[ 302.530560] __do_page_fault+0x2b4/0x590
[ 302.531105] do_page_fault+0x38/0x2c0
[ 302.531693] do_async_page_fault+0x2c/0xb0
[ 302.532274] ? async_page_fault+0x8/0x30
[ 302.532875] async_page_fault+0x1e/0x30
[ 302.533479] RIP: 0033:0x7f0224141c96
[ 302.533966] Code: Bad RIP value.
[ 302.534482] RSP: 002b:00007ffdcc5a5398 EFLAGS: 00010202
[ 302.535217] RAX: 00007f0224c9b000 RBX: 00000000000bf000 RCX: 00007f0224c9b040
[ 302.536335] RDX: 0000000000002f94 RSI: 0000000000000096 RDI: 00007f0224c9b000
[ 302.537339] RBP: 000000001dcd6500 R08: 0000000000000003 R09: 00000000000bf000
[ 302.538524] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000051eb851f
[ 302.539708] R13: 000000000040ab80 R14: 0000000000002f94 R15: 0000000000000000
[ 302.540875] fsstress D 0 1764 1660 0x00000004
[ 302.541680] Call Trace:
[ 302.542061] __schedule+0x2c5/0xad0
[ 302.542619] schedule+0x36/0x90
[ 302.543091] get_unlocked_entry+0xce/0x120
[ 302.543763] ? dax_insert_entry+0x2a0/0x2a0
[ 302.544420] __dax_invalidate_entry+0x65/0x120
[ 302.545095] dax_delete_mapping_entry+0x13/0x20
[ 302.545654] truncate_exceptional_pvec_entries.part.15+0x215/0x220
[ 302.546520] truncate_inode_pages_range+0x2b4/0x9d0
[ 302.547277] ? up_write+0x1f/0x90
[ 302.547816] ? unmap_mapping_pages+0x62/0x130
[ 302.548535] truncate_pagecache+0x48/0x70
[ 302.549156] truncate_setsize+0x32/0x40
[ 302.549775] xfs_setattr_size+0x167/0x530
[ 302.550398] xfs_vn_setattr_size+0x57/0x170
[ 302.551013] xfs_ioc_space+0x2c6/0x3a0
[ 302.551621] ? __might_fault+0x85/0x90
[ 302.552195] xfs_file_ioctl+0xcac/0xdf0
[ 302.552856] ? __might_sleep+0x4a/0x80
[ 302.553464] ? selinux_file_ioctl+0x131/0x1f0
[ 302.554168] do_vfs_ioctl+0xa9/0x6d0
[ 302.554815] ksys_ioctl+0x75/0x80
[ 302.555365] __x64_sys_ioctl+0x1a/0x20
[ 302.555962] do_syscall_64+0x65/0x220
[ 302.556603] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 302.557341] RIP: 0033:0x7f022418e0f7
[ 302.557916] Code: Bad RIP value.
[ 302.558487] RSP: 002b:00007ffdcc5a5468 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 302.559734] RAX: ffffffffffffffda RBX: 0000000000000541 RCX: 00007f022418e0f7
[ 302.560825] RDX: 00007ffdcc5a5490 RSI: 0000000040305824 RDI: 0000000000000003
[ 302.561882] RBP: 0000000000000003 R08: 0000000000000074 R09: 00007ffdcc5a547c
[ 302.562943] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000071de5
[ 302.563952] R13: 0000000000405650 R14: 0000000000000000 R15: 0000000000000000

This happens for me 100% of the time, and doesn't happen at all with
next-20180612.

- Ross