On Mon, Oct 22, 2018 at 11:26 AM Richard Weinberger <richard@xxxxxx> wrote:
>
> On Monday, 22 October 2018, 09:14:08 CEST, Rafał Miłecki wrote:
> > On Fri, 19 Oct 2018 at 14:31, Rafał Miłecki <zajec5@xxxxxxxxx> wrote:
> > > Since OpenWrt switch from kernel 4.9 to 4.14 users started randomly
> > > reporting file system corruptions. OpenWrt uses overlay(fs) with
> > > squashfs as lowerdir and ubifs as upperdir. Russell managed to isolate
> > > & describe test case for reproducing corruption when doing a power cut
> > > after first boot.
> > >
> > > (...)
> > >
> > > Can I ask you to check if there is something possibly wrong with the
> > > above ovl commit? Or does it expose some problem with the ubifs? Or
> > > maybe the whole UBI?
> > >
> > > FWIW testing above commit (and one before it) always results in single
> > > error in the kernel log:
> > > [ 14.250184] UBIFS error (ubi0:1 pid 637): ubifs_add_orphan: orphaned twice
> > >
> > > That UBIFS error doesn't occur with 4.12.14. Unfortunately it's
> > > impossible to cleanly revert 3a1e819b4e80 from the top of 4.12.14.
> >
> > Let me provide a summary of all relevant commits & tests:
> >
> > By "Corruption" I mean file system corruption after power cut
>
> Well, is the filesystem not consistent anymore?
> From what Russel explained to me, I thought the main problem is that no write back happens.
> IOW the inode is present, has correct length, but no content is there (all zeros).
>
> Just like the typical case where userspace does not fsync.
> But in your case sooner or later write back should have happened because the writeback timer
> fires at some point.
>

For the record, overlayfs does:
- open(O_TMPFILE)
- setxattr() [with 3a1e819b4e80]
- write to tmpfile
- fsync tmpfile
- link tmpfile

I suggest that you try the same from user space on ubifs.

Thanks,
Amir.
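
A rough user-space approximation of the sequence listed above might look like the following Python sketch (Linux only). It is not what overlayfs itself runs in the kernel. The function name `write_via_tmpfile` and the attribute name `user.repro` are made up for illustration, and the xattr that overlayfs actually sets is not reproduced here. The link step goes through `/proc/self/fd`, so it assumes /proc is mounted.

```python
import os


def write_via_tmpfile(dirpath, name, data, xattr_name="user.repro"):
    """Create dirpath/name via open(O_TMPFILE), setxattr, write, fsync, link.

    xattr_name is a hypothetical attribute picked for illustration only;
    pass None to skip the setxattr step.
    """
    dir_fd = os.open(dirpath, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # 1. open(O_TMPFILE): unnamed inode on the filesystem holding dirpath
        fd = os.open(dirpath, os.O_TMPFILE | os.O_WRONLY, 0o644)
        try:
            # 2. setxattr() on the still-unlinked inode
            if xattr_name is not None:
                os.setxattr(fd, xattr_name, b"1")
            # 3. write to tmpfile (os.write may write fewer bytes than asked)
            view = memoryview(data)
            while view:
                view = view[os.write(fd, view):]
            # 4. fsync tmpfile
            os.fsync(fd)
            # 5. link tmpfile: passing dst_dir_fd makes Python call linkat()
            #    with AT_SYMLINK_FOLLOW, which resolves the /proc symlink
            os.link(f"/proc/self/fd/{fd}", name,
                    dst_dir_fd=dir_fd, follow_symlinks=True)
        finally:
            os.close(fd)
    finally:
        os.close(dir_fd)
```

To test, one could call this with `dirpath` pointing at a directory on the ubifs volume, cut power shortly afterwards, and check after reboot whether the linked file has its content or only zeros. Any step raises `OSError` if the underlying filesystem does not support it.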