On Thu, Jan 16, 2025 at 03:56:57PM +0100, Jan Kara wrote: ... > > > > Interesting. Is there some endianness issue, by any chance ? I only see the problem > > with sheb (big endian), not with sh (little endian). I'd suspect that it is an > > emulation bug, but it is odd that the problem did not show up before. > > So far I don't have a good explanation. Let me write down here the facts, > maybe it will trigger the aha effect. > > 1) Ext2 stores the metadata in little endian ordering. We observe the > problem with the first directory entry in the folio. Both entry->rec_len > (16-bit) and entry->inode (32-bit) appear to be seen in wrong endianity > > 2) The function that fails is ext2_check_folio(). We kmap_local() the folio > in ext2_get_folio(), then in ext2_check_folio() we do: > > ext2_dirent *p; > > p = (ext2_dirent *)(kaddr + 0); > rec_len = ext2_rec_len_from_disk(p->rec_len); > ^^^ value 3072 == 0x0c00 seen here instead of correct 0x000c > this value is invalid so we go to: > ext2_error(sb, __func__, "bad entry in directory #%lu: : %s - " > "offset=%llu, inode=%lu, rec_len=%d, name_len=%d", > dir->i_ino, error, folio_pos(folio) + offs, > (unsigned long) le32_to_cpu(p->inode), > rec_len, p->name_len); > > Here rec_len is printed so we see the wrong value. Also > le32_to_cpu(p->inode) is printed which also shows number with swapped byte > ordering (the message contains inode number 27393 == 0x00006b01 but the > proper inode number is 363 == 0x0000016b). This actually releals more about > the problem because only the two bytes were swapped in the inode number > although we always treat it as 32-bit entity. So this would indeed point > more at some architectural issue rather than a problem in the filesystem / > MM. > > Note that to get at this point in the boot we must have correctly > byteswapped many other directory entries in the filesystem. So the problem > must be triggered by some parallel activity happening in the system or > something like that. > > 3) The problem appears only with MTD storage, not with IDE/SATA on the same > system + filesystem image. It it unclear how the storage influences the > reproduction, rather than that it influences timing of events in the > system. > > 4) The problem reliably happens with "mm/page-writeback: Consolidate wb_thresh > bumping logic into __wb_calc_thresh", not without it. All this patch does > is that it possibly changes a limit at which processes dirtying pages in > the page cache get throttled. Thus there are fairly limited opportunities > for how it can cause damage (I've checked for possible UAF issues or memory > corruption but I don't really see any such possibility there, it is just > crunching numbers from the mm counters and takes decision based on the > result). This change doesn't have direct on the directory ext2 code. The only > thing it does is that it possibly changes code alignment of ext2 code if it > gets linked afterwards into vmlinux image (provided ext2 is built in). Another > possibility is again that it changes timing of events in the system due to > differences in throttling of processes dirtying page cache. > > So at this point I don't have a better explanation than blame the HW. What > really tipped my conviction in this direction is the 16-bit byteswap in a > 32-bit entity. Hence I guess I'll ask Andrew to put Jim's patch back into > tree if you don't object. I agree that this is most likely an emulation problem (it would not be the first one), so please go ahead. Thanks, Guenter