Re: e2fsck dir_info corruption

Andreas Dilger <adilger@xxxxxxxxx> · Sat, 1 Feb 2020 10:02:00 -0700

On Jan 31, 2020, at 9:07 PM, Andreas Dilger <adilger@xxxxxxxxx> wrote:
> 
> I've been trying to track down a failure with e2fsck on a large filesystem.  Running e2fsck-1.45.2 repeatedly and consistently reports an error in pass2:
> 
> Internal error: couldn't find dir_info for <ino>

[snip]

> The dir_info array itself is over 4GB in size (not sure if this is relevant
> or not), with 380M directories, but the bad inode is only about half way
> through the array ($140 is the index of the problematic entry):

[snip]

> The watchpoint triggered, and saw that the entry was changed by qsort_r(),
> which at first I thought "OK, the dir_info array needs to be sorted, because
> a binary search is used on it", but in fact the array is *created* in order
> during the inode table scan and does not need to be sorted.  As can be seen
> from the stack, it is *another* array that is being sorted that overwrites
> the dir_info entry:

[snip]

> AFAIK, the ext2fs_dblist_sort2() is for the directory *blocks*, and should
> not be changing the dir_info at all.  Is this a bug in qsort or glibc?
> 
> What I just noticed writing this email is that the fs->dblist.list address
> is right in the middle of the dir_info array address range:
> 
>    (gdb) p *fs->dblist
>    $210 = {magic = 2133571340, fs = 0x6c4460,
>      size = 763079922, count = 388821313, sorted = 0,
>      list = 0x7ffad011e010}
>    (gdb) p &ctx->dir_info->array[0]
>    $211 = (struct dir_info *) 0x7ffabf2bd010
>    (gdb) p &ctx->dir_info->array[$140]
>    $212 = (struct dir_info *) 0x7ffb3d327f54
> 
> which might explain why sorting dblist is messing with dir_info?  I don't
> _think_ it is a problem with my build or swap, which is different from
> the system that this was originally reproduced on.

Just like any good mystery, the information I needed was there all along.

After abandoning the previous e2fsck-under-gdb run (which took me a couple
of days to get into the right state, so wasn't done lightly), I restarted
and was tracking the dblist and dir_info allocations, thinking that I might
catch when they became "bad" due to realloc() or similar.  In fact, these
allocations were bad from the beginning (similar to what was shown above,
with dblist in the middle of what should be the dir_info array).  This is
because e2fsck_allocate_memory() takes "unsigned int size" as its argument,
so any request larger than 4GB is truncated to 32 bits and a much smaller
buffer is allocated than the caller expects.  That dooms the allocations
from the start, though it is very surprising that there wasn't massive
corruption visible earlier in the test run...
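
To make the truncation concrete, here is a minimal sketch of the arithmetic.
It assumes a 12-byte dir_info entry (three 32-bit inode numbers) and a round
380M directories, and models the "unsigned int" argument with uint32_t.  The
function names are made up for illustration and are not from e2fsprogs:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed entry size: three 32-bit inode numbers (illustrative only). */
#define DIR_INFO_ENTRY_SIZE 12u

/* Size the caller means to request, computed in 64 bits. */
uint64_t intended_size(uint64_t count, uint64_t entry_size)
{
    return count * entry_size;
}

/* Size that survives being passed through a 32-bit size argument. */
uint32_t truncated_size(uint64_t count, uint64_t entry_size)
{
    /* The conversion keeps only the low 32 bits of the product. */
    return (uint32_t)(count * entry_size);
}

/* Print both values side by side for a given entry count. */
void report_sizes(uint64_t count)
{
    printf("entries:   %" PRIu64 "\n", count);
    printf("intended:  %" PRIu64 " bytes\n",
           intended_size(count, DIR_INFO_ENTRY_SIZE));
    printf("allocated: %" PRIu32 " bytes\n",
           truncated_size(count, DIR_INFO_ENTRY_SIZE));
}
```

Calling report_sizes(380000000) shows an intended size of about 4.56GB
shrinking to roughly 265MB, so anything allocated afterwards can legitimately
land inside the range that the dir_info code believes it owns.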

I think there are two options to fix this:
- change the e2fsck_allocate_memory() argument to "unsigned long size", and
  fix all of the callers to typecast to unsigned long before calling, but
  this is error prone if something is missed or a new allocation is added
- add a new e2fsck_allocate_array() function that takes size and count as
  arguments and does the calculation internally, which I think is more robust
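
As a rough sketch of the second option (not the forthcoming patch), an array
allocator can take the count and the entry size separately, do the
multiplication in size_t, and refuse requests whose product would overflow.
The name alloc_array_checked() and the choice to return NULL on failure are
assumptions made for this example:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate a zeroed array of "count" entries of
 * "entry_size" bytes each.  Returns NULL if the total size does not fit
 * in size_t or if the allocation itself fails.
 */
void *alloc_array_checked(size_t count, size_t entry_size)
{
    size_t total;
    void *buf;

    /* Reject the request if count * entry_size would wrap around. */
    if (entry_size != 0 && count > SIZE_MAX / entry_size)
        return NULL;

    total = count * entry_size;

    /* Request at least one byte so a zero-sized array is still valid. */
    buf = malloc(total ? total : 1);
    if (buf != NULL)
        memset(buf, 0, total);

    return buf;
}
```

Because callers never compute the byte count themselves, there is no place
for a missed typecast to reintroduce the 32-bit truncation.  The standard
calloc() performs a similar overflow check and could serve the same purpose.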

Patches forthcoming, after I have verified that they are working (this may
take a few days due to the lengthy runtime for this filesystem).

Cheers, Andreas
