Re: repair: realloc(): invalid next size

On 10/8/18 9:26 AM, Eric Sandeen wrote:
> 
> 
> On 10/8/18 9:03 AM, Arkadiusz Miśkiewicz wrote:
>>
>> Big fs, ton of small files, repair takes 36h until this happens:
>>
>> rebuilding directory inode 30363993060
>> rebuilding directory inode 30398868604
>> rebuilding directory inode 30414474627
>> rebuilding directory inode 30425006954
>> rebuilding directory inode 30447937553
>> rebuilding directory inode 30529556616
>> rebuilding directory inode 30537494728
>> rebuilding directory inode 30569826838
>> rebuilding directory inode 31060721895
>> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> Warning: recursive buffer locking at block 31060721776 detected
>> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> Warning: recursive buffer locking at block 31060721776 detected
>> Metadata corruption detected at 0x41f980, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> realloc(): invalid next size
>> Aborted
>>
>>
>> Fails somewhere in 0x41f9db <xfs_dir2_sf_verify+603>
>>
>> Complete log at
>> https://ixion.pld-linux.org/~arekm/xfs-1/repair.txt
>>
>> Test was done with xfs_repair 4.17.0 and 4.18.0 with the same result.
>>
>> kernel 4.18.5
>>
>> Running under gdb now.
>>
>> Any ideas?
> 
> With such a big fs it's tough to share a metadump for a reproducer, I assume.
> 
> The earlier write verifier failures on xfs_repair's own writes are troubling...
> 
> I'm not certain why it's rebuilding so many dir inodes; there are several cases
> where that happens, but unfortunately repair doesn't always say which one or why.
> 
> Anyway, you eventually get to this inode (it's the same in decimal & hex
> below):
> 
> rebuilding directory inode 360732305
> Metadata corruption detected at 0x41f9db, inode 0x15805691 data fork
> xfs_repair: warning - iflush_int failed (-117)
> 
> with lots of corruption detected during the writes; the same happens for a
> couple of other inodes, until finally:
> 
> rebuilding directory inode 31060721895
> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
> 
> and this one ends up aborting in glibc's realloc():
> 
> realloc(): invalid next size
> 
> I /think/ this indicates that heap memory was corrupted earlier in the repair
> run (glibc prints this when the allocator metadata next to the buffer looks
> invalid).  :/  Running under valgrind would probably lead to a 72hr runtime
> or more :)
> 
> I wonder if it would save time in the long run to make a metadump and remove all
> directory trees other than this inode (360732305) from it, and see if the same
> failure occurs when running on the reduced fs image?

Actually, if you try that, it would make sense to also leave in place the
directory trees for the other inodes that reported issues.
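
To get the list of inodes that reported issues, something like the following
could be run over the saved repair output. It is only a rough sketch: it
assumes the warnings always have the "Metadata corruption detected at ...,
inode 0x... data fork" form quoted above, and the function name and the sample
lines are made up for illustration. It prints each inode in decimal and hex so
it can be matched against the "rebuilding directory inode" lines.

```python
import re

# Matches the verifier warnings quoted in this thread, e.g.
# "Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork"
# (assumption: the inode is always printed in hex with a 0x prefix).
CORRUPTION_RE = re.compile(
    r"Metadata corruption detected at 0x[0-9a-fA-F]+, inode 0x([0-9a-fA-F]+)"
)


def corrupted_inodes(log_lines):
    """Return the distinct inode numbers (decimal, ascending) named in warnings."""
    found = set()
    for line in log_lines:
        match = CORRUPTION_RE.search(line)
        if match:
            # The warning prints the inode in hex; convert it to an int.
            found.add(int(match.group(1), 16))
    return sorted(found)


# Sample lines copied from the output quoted in this thread.
sample = [
    "rebuilding directory inode 360732305",
    "Metadata corruption detected at 0x41f9db, inode 0x15805691 data fork",
    "xfs_repair: warning - iflush_int failed (-117)",
    "rebuilding directory inode 31060721895",
    "Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork",
    "Metadata corruption detected at 0x41f980, inode 0x73b5d00e7 data fork",
]

for ino in corrupted_inodes(sample):
    print(ino, hex(ino))
```

For the full log, the same function can be fed an open file, e.g.
corrupted_inodes(open("repair.txt")), assuming the output was saved under
that name.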

-Eric


