On Sun, 31 Mar 2019, Alex Xu (Hello71) wrote:
> Excerpts from Vineeth Pillai's message of March 25, 2019 6:08 pm:
> > On Sun, Mar 24, 2019 at 11:30 AM Alex Xu (Hello71) <alex_y_xu@xxxxxxxx> wrote:
> >>
> >> I get this BUG in 5.1-rc1 sometimes when powering off the machine. I
> >> suspect my setup erroneously executes two swapoff+cryptsetup close
> >> operations simultaneously, so a race condition is triggered.
> >>
> >> I am using a single swap on a plain dm-crypt device on a MBR partition
> >> on a SATA drive.
> >>
> >> I think the problem is probably related to
> >> b56a2d8af9147a4efe4011b60d93779c0461ca97, so CCing the related people.
> >>
> > Could you please provide more information on this - stack trace, dmesg etc?
> > Is it easily reproducible? If yes, please detail the steps so that I
> > can try it inhouse.
> >
> > Thanks,
> > Vineeth
> >
>
> Some info from the BUG entry (I didn't bother to type it all,
> low-quality image available upon request):
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> #PF error: [normal kernel read fault]
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP
> CPU: 0 Comm: swapoff Not tainted 5.1.0-rc1+ #2
> RIP: 0010:shmem_recalc_inode+0x41/0x90
>
> Call Trace:
> ? shmem_undo_range
> ? rb_erase_cached
> ? set_next_entity
> ? __inode_wait_for_writeback
> ? shmem_truncate_range
> ? shmem_evict_inode
> ? evict
> ? shmem_unuse
> ? try_to_unuse
> ? swapcache_free_entries
> ? _cond_resched
> ? __se_sys_swapoff
> ? do_syscall_64
> ? entry_SYSCALL_64_after_hwframe
>
> As I said, it only occurs occasionally on shutdown. I think it is a safe
> guess that it can only occur when the swap is not empty, but possibly
> other conditions are necessary, so I will test further.

Thanks for the update, Alex. I'm looking into a couple of bugs with the
5.1-rc swapoff, but this one doesn't look like anything I know so far.
shmem_recalc_inode() is a surprising place to crash: it's as if the
igrab() in shmem_unuse() were not working.

Yes, please do send Vineeth and me (or the lists) your low-quality image,
in case we can extract any more info from it; and also please send the
disassembly of your kernel's shmem_recalc_inode(), so we can be sure of
exactly what it's crashing on (though I expect that will leave me as
puzzled as before).

If you want to experiment with one of my fixes, not yet written up and
posted, just try changing SWAP_UNUSE_MAX_TRIES in mm/swapfile.c from 3
to INT_MAX: I don't see how that issue could manifest as crashing in
shmem_recalc_inode(), but I may just be too stupid to see it.

Thanks,
Hugh