This is my first post to the LKML, so please be kind :)

I have also been affected by this bug. It is triggered whenever a write happens to the filesystem, which means mounting read-only is an available option to recover data.

I took the time to do a full bisect on the kernel sources and have identified the commit where the breakage happens.

Regarding versions, I can confirm that 4.19.83 is stable with regard to NILFS, and that 4.19.84 and later are broken. I can also confirm that 5.3.10 works fine, and I have heard that 5.3.12 breaks NILFS as well. The 5.4.18 kernel still has this issue. I did not trace how far back the issue goes in the 5.4.x series, or look in more detail at the 5.3.x series.

To simplify my bisection task, I used the 4.19.x series, and determined that commit d3b3c0a14615c495118acc4bdca23d53eea46ed2 is the commit that breaks NILFS. Furthermore, when reverting this commit on otherwise clean 4.19.84 kernel sources, the NILFS issue no longer occurs.

I'm not familiar enough with NILFS's internals to determine why the small caching change in that commit breaks NILFS, nor can I offer a patch to fix it (besides reverting the offending change), but I can confirm that this is the initial cause. I also know there have been a lot of changes to kernel caching in more recent (5.4 / 5.5 / 5.6) kernels, so perhaps there are still more diagnostics to do.

I have the test VM that I used for bisection available if someone wants to coordinate with me to put together a patch for this, but ideally someone can take my diagnostic effort here and make use of it directly. I saved dmesg logs from both good and bad cases and can send them if anyone is interested. I can also provide reasonably detailed system setup instructions to reproduce the issue.
I did my testing against an existing external hard drive, but I have been able to reproduce the issue consistently against a freshly created loopback mount as well, so it is not just caused by disk corruption or an unclean unmount.

- Brian

On Sat, Feb 15, 2020 at 8:11 PM ARAI Shun-ichi <hermes@xxxxxxxxxxxxxxx> wrote:
>
> And,
>
> In <20200210.224609.499887311281343618.hermes@xxxxxxxxxxxxxxx>;
> ARAI Shun-ichi <hermes@xxxxxxxxxxxxxxx> wrote
> as Subject "Re: BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 in nilfs_segctor_do_construct":
>
> > Hi,
> >
> > FYI, reporting additional test results.
> >
> > I reproduced this problem with clean NILFS2 fs in previous mail.
> > "clean" means that "make filesystem before every tests."
> > In this mail, I tried to reproduct with/without VG/LV, LUKS, loopback.
> >
> > * Not reproduced
> > USB stick - primary partition - NILFS2
> > USB stick - primary partition - VG/LV - NILFS2
> > USB stick - primary partition - VG/LV - LUKS - NILFS2
> > USB stick - primary partition - LUKS - VG/LV - NILFS2
> > USB stick - primary partition - LUKS - VG/LV - LUKS - NILFS2
> > /tmp (tmpfs) - regular file - NILFS2 (loopback mount, kernel 4.19.82)
>
> USB stick - primary partition(512MiB) - NILFS2
> >
> > * Reproduced (always, immediately)
> > /tmp (tmpfs) - regular file - NILFS2 (loopback mount)
> > USB stick - primary partition - ext4 - regular file - NILFS2 (loopback mount)
>
> this loopback problem is seen in Kernel 5.5.4.
>
> > Test conditions:
> > kernel 4.19.86 (same as previous test)
> > NILFS2/ext4 filesystem, VG/LV, LUKS were made with default parameters
> > size of "primary partition" in USB stick is approx. 14GiB
> > size of "regular file" is approx. 512MiB
> > "reproduce": mount NILFS2, touch file, sync
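
For anyone who wants to try the loopback case, the quoted "reproduce" steps (mount NILFS2, touch file, sync) on a freshly made 512MiB regular file could be scripted roughly as below. This is only a sketch, not the exact commands used in either report: the image path, mount point, test file name, and the use of truncate to create the backing file are all assumptions, and it also assumes mkfs.nilfs2 from nilfs-utils accepts a regular file (if not, attach the file with losetup first). It prints the commands by default and only runs them when dry_run is turned off.

```python
import shlex
import subprocess


def repro_commands(image="/tmp/nilfs-test.img",
                   mountpoint="/mnt/nilfs-test",
                   size_mib=512):
    """Return the commands for one reproduction attempt, in order.

    All paths are hypothetical placeholders; adjust them for your system.
    """
    return [
        # Create the backing "regular file" (approx. 512MiB in the report).
        ["truncate", "-s", f"{size_mib}M", image],
        # Make a fresh NILFS2 filesystem with default parameters.
        # Assumption: mkfs.nilfs2 works directly on a regular file.
        ["mkfs.nilfs2", image],
        ["mkdir", "-p", mountpoint],
        # Loopback mount of the image.
        ["mount", "-t", "nilfs2", "-o", "loop", image, mountpoint],
        # The write that triggers the problem on affected kernels.
        ["touch", f"{mountpoint}/testfile"],
        ["sync"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("+", shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: on an affected kernel the real run needs root
    # and may oops, so only do it in a disposable VM.
    run(repro_commands())
```

Pointing the image path at a file on tmpfs or on an ext4 partition would correspond to the two "Reproduced" loopback cases listed in the quoted mail.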