[resending without HTML attachment] [see http://files.zajec.net/openwrt/ubifs-dbg.html instead] On Mon, 22 Oct 2018 at 10:26, Richard Weinberger <richard@xxxxxx> wrote: > Am Montag, 22. Oktober 2018, 09:14:08 CEST schrieb Rafał Miłecki: > > On Fri, 19 Oct 2018 at 14:31, Rafał Miłecki <zajec5@xxxxxxxxx> wrote: > > > Since OpenWrt switch from kernel 4.9 to 4.14 users started randomly > > > reporting file system corruptions. OpenWrt uses overlay(fs) with > > > squashfs as lowerdir and ubifs as upperdir. Russell managed to isolate > > > & describe test case for reproducing corruption when doing a power cut > > > after first boot. > > > > > > (...) > > > > > > Can I ask you to check if there is something possibly wrong with the > > > above ovl commit? Or does it expose some problem with the ubifs? Or > > > maybe the whole UBI? > > > > > > FWIW testing above commit (and one before it) always results in single > > > error in the kernel log: > > > [ 14.250184] UBIFS error (ubi0:1 pid 637): ubifs_add_orphan: orphaned twice > > > > > > That UBIFS error doesn't occur with 4.12.14. Unfortunately it's > > > impossible to cleanly revert 3a1e819b4e80 from the top of 4.12.14. > > > > Let me provide a summary of all relevant commits & tests: > > > > By "Corruption" I mean file system corruption after power cut > > Well, is the filesystem not consistent anymore? > From what Russel explained to me, I thought the main problem is that no write back happens. > IOW the inode is present, has correct length, but no content is there (all zeros). I probably misused "corruption" word. What I meant by "corruption" was file having all "00"es instead of expected data. > Just like the typical case where userspace does not fsync. > But in your case sooner or later write back should have happened because the writeback timer > fires at some point. As you probably noticed I wrote tmptest.c - could you test it, please? See if you can reproduce the problem? Please call it with 3 arguments: 1) Path on ubifs mount point 2) Some xattr name 3) Some xattr value Please note I wait 5 seconds (this matches vm.dirty_writeback_centisecs being 500) before doing a power cut. That lets ubifs write to flash. For some reason files that got fsetxattr called for them still are all "00"es after a power cut. I did some extra testing after enabling ubifs debugging for io.c, file.c and journal.c. Debugging output looks like expected. I can clearly see wbuf_timer_callback_nolock() being called. I attached my debugging summary as ubifs-dbg.html please take a look at it in case I missed something. -- Rafał ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/