Re: help channel for md raid?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02/08/21 15:38, Stephan Böttcher wrote:
> 
> Moin!
> 
> May I ask a question on this list, or is this strictly for development?

Ask away!
> 
> Thanks,
> Stephan
> 
> 
> My qustion is how to translate sector numbers from a RAID6 as in
> 
>> Aug  2 01:32:28 falbala kernel: md8: mismatch sector in range 1460408-1460416
> 
> to ext4 inode numbers, as in
> 
>> debugfs: icheck 730227 730355 730483 730611
>> Block   Inode number
>> 730227  30474245
>> 730355  30474245
>> 730483  30474245
>> 730611  30474245
> 
> Is there a list, channel, matrix room for this kind of questions?
> Are there tools to do what I need?
> Is the approach below sensible?
> 
> It is a RAID6 with six drives, one of them failed.
> A check yielded 378 such mismatches.
> 
> I assume the sectors count from the start of the `Data Offset`.
> `ext4 block numbers` count from the start of the partition?
> Is that correct?

md8 is your array. This is the block device presented to your file
system so you're feeding it 730227, 730355, 730483, 730611, these are
the sector numbers of md8, and they will be *linear* within it.

So if you're trying to map filesystem sectors to md8 sectors, they are
the same thing.

Only if you're trying to map filesystem sectors to the hard drives
underlying the raid do you need to worry about the 'Data Offset'. (And
this varies on a per-drive basis!)
> 
> The failed drive has >3000 unreadble sectors and became very slow.

So you've removed the drive? Have you replaced it? If you have a drive
fail again, always try and replace it using the --replace option, I
think it's too late for that now ...

But as for finding out which files may have been corrupted, you want to
use the tools that come with the filesystem, and ask it which files use
sectors 1460408-1460416. Don't worry about the underlying raid.
Hopefully those tools will come back and say those sectors are unused.
If they ARE used, the chances are it's just the parity which is corrupt.
Otherwise you're looking at a backup ...

So I think what you need to do is (1) find out which files use those
sectors, (2) replace that missing drive asap, and (3) check the
integrity of the file system with fsck. Then (4) do a repair scrub.

(I gather such errors are reasonably common, and do not signify a
problem. On a live filesystem they could well be a collision between a
file write and the check ...)

Cheers,
Wol



[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux