Re: Nandsim, UBIFS and memory concerns

Hello Steve,

On Fri, Oct 5, 2018 at 00:43, Steve deRosier <derosier@xxxxxxxxx> wrote:
>
> Hi Romain,
>
> On Thu, Oct 4, 2018 at 9:53 AM Romain Izard <romain.izard.pro@xxxxxxxxx> wrote:
> >
> > On a regular but slow basis, I get reports of devices based on UBIFS running
> > Linux 4.14 where the file system gets corrupted during an update. The update
> > process creates new files with temporary names to replace existing files,
> > and uses renames to replace these files atomically. What is observed is that
> > in some cases, the update log describes all steps for a complete update, and
> > yet some files contain the new version while others contain an older
> > version. Moreover, it seems that some files with temporary names that should
> > have been renamed are visible.
> >
> > As the update process is also able to use tmpfs to create files, and will
> > use a large part of the available memory, I fear that this issue is related
> > to the behaviour of UBIFS in low memory conditions. I'm wondering about
> > UBIFS losing some parts of the log when an ENOMEM condition occurs during its
> > operations or when the OOM killer targets a process that is doing some UBIFS
> > processing.
> >
>
> I've seen the sort of symptoms that you describe in the wild. But
> what I have seen has never had anything to do with UBIFS, but only
> with problems with how updates (or other large filesystems operations)
> are implemented. Specifically, the lack of a filesystem sync before a
> reboot will have these exact effects. What you end up with is a
> situation where the filesystem operations are done, yet the changes
> haven't actually been flushed to "disk".  It doesn't matter whether it's
> an HDD or UBIFS on flash, the effect is the same, though the window of
> vulnerability might be different.
>
> Especially since you mention the OOM killer and using tmpfs - I'd look
> into whether you're running out of RAM, and either causing a reboot oops
> or at least killing the process before all file operations are
> complete. Just because your log shows the operation was triggered at
> the userspace level doesn't mean the kernel has completed all
> filesystem operations and written to the physical device.
>
> What you describe is not a UBIFS corruption, but a garden-variety
> user-space file operations corruption issue.
>
> As I said, I've encountered this before. The only thing you can do is
> to examine your process and tailor it to be sure it completes its
> physical writes.  In our case, we had a few things to solve:
>
>  * put 'sync' calls in our update scripts,
>  * avoid the use of a problematic utility, and
>  * try the `-osync` flag.
>
> (-osync fixed the problem at the cost of a performance hit. Later we
> decided not to go that way and instead instructed our customers how to
> properly write programs that write to the filesystem.)

Thank you for sharing your experience on this topic. This will help
me to concentrate on checking my own code, rather than spending
time analysing something that works.
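
For reference, here is a minimal sketch of the "write, flush, then rename"
sequence that the advice above points to. It is written in Python and
assumes a Linux target. The helper name durable_replace and the ".update-"
prefix are hypothetical and are not taken from the actual update process.
It only illustrates where explicit flushes can go, and is not a statement
of what UBIFS requires.

```python
import os
import tempfile


def durable_replace(path, data):
    """Replace `path` with `data` via a temporary file and explicit flushes.

    Hypothetical helper, for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temporary file next to the target so that the rename
    # stays within a single filesystem. Note that mkstemp creates the
    # file with mode 0600; adjust permissions if the target needs others.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".update-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # push Python's buffer to the kernel
            os.fsync(tmp.fileno())   # ask the kernel to write the file out
        os.replace(tmp_path, path)   # atomic rename over the old file
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # Flush the directory entry as well, so the rename itself is written.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

In an update step this would be called once per file, for example
durable_replace("/data/app.conf", new_bytes), where the path is again only
an example. A shell-based updater would instead need a 'sync' call after
the renames and before any reboot, as suggested above.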

Best regards,
-- 
Romain Izard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/