[no subject]

Thanks a lot for all your help, Ted. I would appreciate it if you could
prioritize the fix.

On Tue, Mar 29, 2022 at 6:38 PM Theodore Ts'o <tytso@xxxxxxx> wrote:
>
> (Removing linux-fsdevel from the cc list since this is an ext4
> specific issue.)
>
> On Mon, Mar 28, 2022 at 09:38:18PM +0530, Fariya F wrote:
> > Hi Ted,
> >
> > Thanks for the response. Really appreciate it. Some questions:
> >
> > a) This issue is observed on one of the customer boards, and hence a
> > fix is a must for us, or at least I will need a work-around so that
> > other customer boards do not face this issue. As I mentioned, my script
> > relies on the used percentage in the df -h output. In the case of the
> > board reporting 16Z of used space and size, the available space is
> > somehow reported correctly. Should my script rely on the available
> > space and not on the used space% output of df? Will that be a reliable
> > work-around? Do you see any issue in continuing to use the partition
> > from then on, or could the overhead blocks number create a problem
> > somewhere down the line, so that my partition ends up misbehaving or
> > some sort of data loss occurs? Data loss would be a concern for us.
> > Please guide.
>
> I'm guessing that the problem was caused by a bit-flip in the
> superblock, so it was just a matter of hardware error.  What version
> of e2fsprogs are you using, and did you have the metadata checksum
> (metadata_csum) feature enabled?  Depending on where the bit-flip
> happened --- e.g., whether it was in memory and then the superblock
> was written out, or on
> the eMMC or other storage device --- if the metadata checksum feature
> caught the superblock error, it would have detected the issue, and
> while it would have required a manual fsck to fix it, at that point it
> would have fallen back to use the backup superblock version.
>
> > b) Any other suggestions for a work-around so that, even if the
> > overhead blocks value reports more blocks than the actual blocks on
> > the partition, I am able to use the partition reliably? Or do you
> > think it would be better to wait for the fix in e2fsprogs?
> >
> > I think apart from the fix in the e2fsprogs tool, a kernel fix is also
> > required, wherein it performs a check that the overhead blocks should
> > not be greater than the actual blocks on the partition.
>
> Yes, we can certainly have the kernel check to see if the overhead
> value is completely insane, and if so, recalculate it (even though it
> would slow down the mount).
>
> Another thing we could do is to always recalculate the overhead amount
> if the file system is smaller than some arbitrary size, on the theory
> that (a) for small file systems, the increased time to mount the file
> system will not be noticeable, and (b) embedded and mobile devices are
> often where "cost optimized" (my polite way of saying crappy quality
> to save a penny or two in Bill of Materials costs) components are most
> likely, and so those are where bit flips are more likely.
>
> Cheers,
>
>                                                 - Ted
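
A minimal sketch of the work-around asked about in (a): base the script's
decision on the available space rather than on the used percentage printed
by df, and separately flag a total size that cannot be right. It assumes
Python is available on the board. The names check_space, min_free_bytes and
device_bytes are hypothetical, and the caller has to supply the real size
of the partition. This only keeps a monitoring script from being misled;
it does not repair the overhead value in the superblock.

```python
import os


def check_space(path, min_free_bytes, device_bytes=None, statvfs=os.statvfs):
    """Return (enough_space, total_plausible) for the file system at path.

    enough_space uses only f_bavail, the field that was still reported
    correctly on the affected board. total_plausible is False when the
    reported size is smaller than the free space or larger than the
    underlying partition (device_bytes, supplied by the caller).
    """
    st = statvfs(path)
    avail_bytes = st.f_bavail * st.f_frsize
    total_bytes = st.f_blocks * st.f_frsize

    total_plausible = st.f_blocks >= st.f_bfree
    if device_bytes is not None and total_bytes > device_bytes:
        # A file system cannot be larger than the partition it lives on,
        # so a size such as 16Z points at a bogus overhead value.
        total_plausible = False

    return avail_bytes >= min_free_bytes, total_plausible


if __name__ == "__main__":
    # Hypothetical values: require 10 MB free on a partition of 8 GiB.
    enough, plausible = check_space("/", 10_000_000, device_bytes=8 * 2**30)
    print("enough space:", enough, "| reported size plausible:", plausible)
```

When total_plausible comes back False, the script can log the condition and
schedule an fsck instead of acting on the used percentage.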