On Mon 13-07-20 19:14:47, Ritesh Harjani wrote:
> On 7/13/20 6:28 PM, Jan Kara wrote:
> > From: Wolfgang Frisch <wolfgang.frisch@xxxxxxxx>
> >
> > When extent tree is corrupted we can hit BUG_ON in
> > ext4_es_cache_extent(). Check for this and abort caching instead of
> > crashing the machine.
>
> Was it intentionally made corrupted by crafting a corrupted disk image?

I'm not sure how Wolfgang hit the issue. I'd expect some fs image
fuzzing... Wolfgang?

> Are there more such logic in place which checks for such corruption at other
> places?

That's a good question. But now that I'm looking at it, ext4_ext_check()
should actually catch a corruption like this. It is only the path in
ext4_find_extent()->ext4_cache_extents() that can face the issue, so
instead of a fix in ext4_cache_extents() we should probably add more careful
extent info checks for the extents contained directly in the inode. I'll
look into it.

> Maybe a background over the issue which you saw may help.
> Also how did it recover out of it?

e2fsck, I suppose :)

> Do you think it make sense to still emit a WARN_ON() here and then
> return which warns that this could possibly a corrupted extent
> entry? (maybe WARN_ON_ONCE() or via some ratelimiting if multiple extent
> entries are corrupted for that inode).

No, WARN is definitely wrong in this case. We could call ext4_error() if we
wanted. That would make sense, although I decided not to add it to
Wolfgang's original fix since this is more like a failing readahead. But
OTOH it's metadata corruption that's unlikely to go away, so I can be easily
convinced to put ext4_error() there :).

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR