On Fri, Aug 12, 2022 at 10:33:20AM +0800, Baokun Li wrote:
> On 2022/8/5 22:00, Luís Henriques wrote:
...
> > This bug is easily reproducible using the filesystem image provided --
> > it's just a matter of mounting it and run:
> >
> >     $ cat /mnt/foo/bar/xattr
> >
> Hi Luís,
> yeah, that's a good catch!
> > Anyway, I hope my analysis of the bug is correct -- the root cause seems
> > to be an extent header with an invalid value for eh_entries, which will
> > later cause the BUG_ON().
> >
> > Cheers,
> > --
> > Luís
> But there's a little bit of a deviation in your understanding of the
> problem, so the patch doesn't look good.
> The issue is caused by the contradiction between eh_entries and eh_depth.

Ah!  This makes a lot of sense and I can confirm this is exactly what
happens in both bugzilla images.  Thanks a lot for your feedback!

> Therefore, we need to check the contradiction instead of adding a judgment
> to ext4_ext_binsearch_idx.
> So the right fix is to add a check to __ext4_ext_check like:
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index c148bb97b527..2dfd35f727cb 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -460,6 +460,10 @@ static int __ext4_ext_check(const char *function, unsigned int line,
>  		error_msg = "invalid eh_entries";
>  		goto corrupted;
>  	}
> +	if (unlikely((eh->eh_entries == 0) && (depth > 0))) {
> +		error_msg = "contradictory eh_entries and eh_depth";
> +		goto corrupted;
> +	}
>  	if (!ext4_valid_extent_entries(inode, eh, lblk, &pblk, depth)) {
>  		error_msg = "invalid extent entries";
>  		goto corrupted;
>
> In this way, we can fix this issue and check for header exceptions before
> calling ext4_ext_binsearch_idx.

Awesome, I'll send out v2 with the suggested change.  It makes sense to
have this check and it should fix both bugs.

On the other hand, I still wonder whether the extra check in my original
patch is correct or not.
I spent a good amount of time trying to find out if eh_entries can be 0
at that point (in ext4_ext_binsearch_idx()) and couldn't find a situation
where it could.  And running the fstests with that check didn't show any
problem.  But yeah, my understanding of the whole code is far from perfect.

Cheers,
--
Luís
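
For reference, here is a minimal userspace sketch of the consistency rule
discussed above.  It is not the kernel code: struct toy_extent_header,
toy_ext_check() and toy_demo() are hypothetical names made up for
illustration, endianness is ignored, and only the two header fields
mentioned in this thread are modelled.  The assumption is that an index
node (depth > 0) with zero entries is contradictory, while an empty leaf
(depth == 0) is acceptable.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the extent header; only the two fields
 * discussed in the thread are modelled here.
 */
struct toy_extent_header {
	uint16_t eh_entries;	/* number of valid entries in this node */
	uint16_t eh_depth;	/* 0 for a leaf node, > 0 for an index node */
};

/*
 * Return 0 if the header passes the check, -1 if eh_entries and
 * eh_depth contradict each other (index node with nothing to index).
 */
static int toy_ext_check(const struct toy_extent_header *eh)
{
	if (eh->eh_entries == 0 && eh->eh_depth > 0)
		return -1;	/* "contradictory eh_entries and eh_depth" */
	return 0;
}

/* Exercise the three interesting combinations. */
static void toy_demo(void)
{
	struct toy_extent_header empty_leaf = { 0, 0 };
	struct toy_extent_header index_node = { 3, 1 };
	struct toy_extent_header bad_index  = { 0, 1 };

	assert(toy_ext_check(&empty_leaf) == 0);
	assert(toy_ext_check(&index_node) == 0);
	assert(toy_ext_check(&bad_index) == -1);
}
```

Rejecting the header up front means a later binary search over the index
entries never has to cope with an empty index node.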