Re: Power cut leads to "corrupt empty space"

Timo,

On Thu, Feb 27, 2020 at 2:04 PM Timo Ketola <Timo.Ketola@xxxxxxxxxx> wrote:
> We have a few i.MX6D devices which have corrupted their UBIFS filesystem
> on power cut and refuse to mount them any more.
>
> The log says:
>
> > [   10.382580] UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 158
> > [   10.408838] UBIFS (ubi1:0): recovery needed
> > [   10.802070] UBIFS error (ubi1:0 pid 157): ubifs_scan: corrupt empty space at LEB 99:114688
> > [   10.809054] UBIFS error (ubi1:0 pid 157): ubifs_scanned_corruption: corruption at LEB 99:114688
> > [   10.816471] UBIFS error (ubi1:0 pid 157): ubifs_scanned_corruption: first 8192 bytes from LEB 99:114688
> > [   10.824585] 00000000: 06101831 713b7e1b 002e0640 00000000 000000a0 00000200 00000554 00000000  1....~;q@...............T.......
> > [   10.824601] 00000020: 00000000 00000000 0001585b 00000000 0008c48d 00000000 5d512897 00000000  ........[X...............(Q]....
>
> ...
>
> > [   10.827751] UBIFS error (ubi1:0 pid 157): ubifs_scan: LEB 99 scanning failed
> > [   10.834615] UBIFS (ubi1:0): background thread "ubifs_bgt1_0" stops
>
> I think I found the culprit from the mtdblock contents. Fragment from
> hexdump:
>
> > 3ca20000  55 42 49 23 01 00 00 00  00 00 00 00 00 00 00 04  |UBI#............|
> > 3ca20010  00 00 08 00 00 00 10 00  0c 4d 7c ed 00 00 00 00  |.........M|.....|
> > 3ca20020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca20030  00 00 00 00 00 00 00 00  00 00 00 00 cb 5d 1f 01  |.............]..|
> > 3ca20040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 3ca20800  55 42 49 21 01 01 00 00  00 00 00 00 00 00 00 63  |UBI!...........c|
> > 3ca20810  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca20820  00 00 00 00 00 00 00 00  00 00 00 00 00 00 8d 07  |................|
> > 3ca20830  00 00 00 00 00 00 00 00  00 00 00 00 91 2b 87 87  |.............+..|
> > 3ca20840  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 3ca21000  31 18 10 06 30 3c 6d 96  cd 05 2e 00 00 00 00 00  |1...0<m.........|
> > 3ca21010  a0 00 00 00 00 02 00 00  54 05 00 00 00 00 00 00  |........T.......|
>
> ...
>
> > 3ca3b8c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 3ca3c000  31 18 10 06 7b 71 87 8f  3c 06 2e 00 00 00 00 00  |1...{q..<.......|
> > 3ca3c010  a0 00 00 00 00 02 00 00  54 05 00 00 00 00 00 00  |........T.......|
> > 3ca3c020  00 00 00 00 00 00 00 00  5b 58 01 00 00 00 00 00  |........[X......|
> > 3ca3c030  79 c3 08 00 00 00 00 00  97 28 51 5d 00 00 00 00  |y........(Q]....|
> > 3ca3c040  19 58 6d 38 00 00 00 00  19 58 6d 38 00 00 00 00  |.Xm8.....Xm8....|
> > 3ca3c050  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
> > 3ca3c060  eb 03 00 00 eb 03 00 00  a4 81 00 00 01 00 00 00  |................|
> > 3ca3c070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca3c080  00 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca3c090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca3c0a0  31 18 10 06 84 13 e1 a0  00 00 00 00 00 00 00 00  |1...............|
> > 3ca3c0b0  1c 00 00 00 05 00 00 00  44 07 00 00 00 00 00 00  |........D.......|
> > 3ca3c0c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 3ca3c800  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
> > *
> > 3ca3d000  31 18 10 06 1b 7e 3b 71  40 06 2e 00 00 00 00 00  |1....~;q@.......|

So there is a whole 2KiB area of 0xFF in there.
It is also aligned, so it could be a whole page.

> > 3ca3d010  a0 00 00 00 00 02 00 00  54 05 00 00 00 00 00 00  |........T.......|
> > 3ca3d020  00 00 00 00 00 00 00 00  5b 58 01 00 00 00 00 00  |........[X......|
> > 3ca3d030  8d c4 08 00 00 00 00 00  97 28 51 5d 00 00 00 00  |.........(Q]....|
> > 3ca3d040  19 58 6d 38 00 00 00 00  19 58 6d 38 00 00 00 00  |.Xm8.....Xm8....|
> > 3ca3d050  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
> > 3ca3d060  eb 03 00 00 eb 03 00 00  a4 81 00 00 01 00 00 00  |................|
> > 3ca3d070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca3d080  00 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca3d090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca3d0a0  31 18 10 06 84 13 e1 a0  00 00 00 00 00 00 00 00  |1...............|
> > 3ca3d0b0  1c 00 00 00 05 00 00 00  44 07 00 00 00 00 00 00  |........D.......|
> > 3ca3d0c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 3ca3d800  31 18 10 06 c1 6b e6 57  42 06 2e 00 00 00 00 00  |1....k.WB.......|
> > 3ca3d810  a0 00 00 00 00 02 00 00  54 05 00 00 00 00 00 00  |........T.......|
> > 3ca3d820  00 00 00 00 00 00 00 00  5b 58 01 00 00 00 00 00  |........[X......|
> > 3ca3d830  0d c5 08 00 00 00 00 00  97 28 51 5d 00 00 00 00  |.........(Q]....|
> > 3ca3d840  19 58 6d 38 00 00 00 00  19 58 6d 38 00 00 00 00  |.Xm8.....Xm8....|
> > 3ca3d850  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
> > 3ca3d860  eb 03 00 00 eb 03 00 00  a4 81 00 00 01 00 00 00  |................|
> > 3ca3d870  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca3d880  00 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca3d890  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > 3ca3d8a0  31 18 10 06 84 13 e1 a0  00 00 00 00 00 00 00 00  |1...............|
> > 3ca3d8b0  1c 00 00 00 05 00 00 00  44 07 00 00 00 00 00 00  |........D.......|
> > 3ca3d8c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > *
> > 3ca3e000  31 18 10 06 0b 75 3d 9e  44 06 2e 00 00 00 00 00  |1....u=.D.......|
>
> IIUC, ubifs_scan finds empty space at 3ca3c800, stops scanning and
> checks the rest of the LEB for being empty but finds something else at
> 3ca3d000. Then recovery aborts and mounting fails.
>
> Do I understand correctly that empty space should always be continuous
> at the end of the LEB?

Correct.
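
To make the rule concrete, here is a simplified Python model of the check, not the kernel code. It assumes a UBI data offset of 0x1000 (the first UBIFS node magic in the dump sits at 3ca21000, which is 0x1000 past the PEB start at 3ca20000). The function name and the zero-filled stand-in for written data are made up for illustration.

```python
EMPTY = 0xFF

PEB_BASE = 0x3CA20000      # start of the PEB in the quoted hexdump
DATA_OFFSET = 0x1000       # assumption: LEB data starts here within the PEB
PEB_SIZE = 0x20000         # erase block size given in the report
LEB_SIZE = PEB_SIZE - DATA_OFFSET


def leb_offset(flash_addr: int) -> int:
    """Translate a hexdump address into an offset inside the LEB."""
    return flash_addr - PEB_BASE - DATA_OFFSET


def find_corrupt_empty_space(leb: bytes, empty_start: int):
    """Return the offset of the first non-0xFF byte at or after
    empty_start, or None if the rest of the LEB really is empty."""
    for offs in range(empty_start, len(leb)):
        if leb[offs] != EMPTY:
            return offs
    return None


# The 0xFF run in the dump spans 3ca3c800 .. 3ca3d000.
hole_start = leb_offset(0x3CA3C800)
hole_end = leb_offset(0x3CA3D000)

# Stand-in LEB: zeros represent written data, with the 0xFF hole cut out.
leb = bytearray(LEB_SIZE)
leb[hole_start:hole_end] = b"\xff" * (hole_end - hole_start)

bad = find_corrupt_empty_space(bytes(leb), hole_start)
print(hole_start, hole_end - hole_start, bad)  # 112640 2048 114688
```

With those assumptions the first non-0xFF byte after the hole is at LEB offset 114688, the same offset the log reports.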

> How could this kind of corruption happen?

Hard to say. Maybe bad timing settings which cause writes to have no effect.
But usually this leads to ECC errors.
If you can share the image with me, I can have a look, and with some luck we
will find traces.

Is this a mainline kernel?
Wonky drivers can lead to all kinds of "interesting" results. :->

> Is there any way to recover from this?

Not really. UBIFS' I/O model was violated, so it gives up.

> Storage is NAND with 0x20000 erase block size and the kernel is 4.9.88.

I guess 2KiB page size?
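
The dump itself supports that guess. The rough Python sketch below decodes the node header at 3ca3d000 and the node that follows it at 3ca3d0a0. The header layout (magic, crc, sqnum, len, node type, group type, two padding bytes, then pad_len for a padding node) and the padding node type value 5 are my reading of the UBIFS on-flash format and should be treated as assumptions. The bytes are copied from the quoted hexdump.

```python
import struct

# Assumed UBIFS common header layout (little-endian, 24 bytes):
# magic, crc, sqnum, len, node_type, group_type, 2 padding bytes.
COMMON_HDR = struct.Struct("<IIQIBB2x")
UBIFS_NODE_MAGIC = 0x06101831
UBIFS_PAD_NODE = 5  # assumed node type value for padding nodes

# First 24 bytes at 3ca3d000 (header of the node the scan tripped over).
first_node = bytes.fromhex(
    "31 18 10 06 1b 7e 3b 71 40 06 2e 00 00 00 00 00 "
    "a0 00 00 00 00 02 00 00"
)
# 28 bytes at 3ca3d0a0 (common header plus the 4-byte pad_len field).
pad_node = bytes.fromhex(
    "31 18 10 06 84 13 e1 a0 00 00 00 00 00 00 00 00 "
    "1c 00 00 00 05 00 00 00 44 07 00 00"
)

magic, _crc, _sqnum, node_len, _type, _group = COMMON_HDR.unpack(first_node)
p_magic, _, _, pad_hdr_len, pad_type, _ = COMMON_HDR.unpack(pad_node[:24])
(pad_len,) = struct.unpack("<I", pad_node[24:28])

# Node + padding node header + padded bytes should fill one write unit.
total = node_len + pad_hdr_len + pad_len
print(node_len, pad_hdr_len, pad_len, total)  # 160 28 1860 2048
```

Under those assumptions each write in this part of the LEB is a 160-byte node followed by a padding node that fills the unit up to exactly 2048 bytes, which matches both a 2KiB page and the size of the 0xFF hole.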

-- 
Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/