Re: request for help on IMSM-metadata RAID-5 array

Hi,

On 2023/09/24 2:49, Joel Parthemore wrote:
So, dd finally sped up and finished. It appears that I have lost none of my data. I am a very happy man. A question: is there anything useful I am likely to discover by keeping the RAID array as it is for a bit longer before I recreate it and copy the data back?

It would be much more helpful for developers if you could collect the kernel
stacks of all stuck threads (and it would be even better to resolve them with
addr2line).
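For example, something along these lines (a sketch, assuming SysRq is enabled
and a vmlinux with debug info for the running kernel is at hand; the address
passed to addr2line is only a placeholder):

  # dump the kernel stacks of all uninterruptible (D-state) tasks to the kernel log
  echo w > /proc/sysrq-trigger
  dmesg | tail -n 300 > stuck-stacks.txt

  # resolve a raw address from one of the traces to file:line
  addr2line -e vmlinux ffffffff81234567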

Thanks,
Kuai


Joel

-----------------------------------------------------------------------------

I have been wondering about HDD issues all along, of course, though I didn't see any smoking gun.


I ran iostat -x 2 /dev/sdX on all three drives. All show an idle rate of just under 90%. So I don't think that's the problem.

Joel

On 2023-09-23 at 17:35, Roman Mamedov wrote:
On Sat, 23 Sep 2023 17:18:00 +0200
Joel Parthemore <joel@xxxxxxxxxxxxxxx> wrote:

I didn't want to try that again until I had confirmation that the
out-of-sync wouldn't (or shouldn't) be an issue. (I had tried it once
before, but the system had somehow swapped /dev/md126 and /dev/md127 so
that /dev/md126 became the container and /dev/md127 the RAID-5 array,
which confused me. So I stopped experimenting further until I had a
chance to write to the list.)

The array is assembled read only, and this time both /dev/md126 and
/dev/md127 are looking like I expect them to. I started dd to make a
backup image using dd if=/dev/md126 of=/dev/sdc bs=64K
conv=noerror,sync. (The EXT4 file store on the 2TB RAID-5 array is about
900GB full.) At first, it was running most of the time and just
occasionally in uninterruptible sleep, but the periods of
uninterruptible sleep quickly started getting longer. Now it seems to be
spending most but not quite all of its time in uninterruptible sleep. Is
this some kind of race condition? Anyway, I'll leave it running
overnight to see if it completes.
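
If it would help with debugging, a quick way to see where dd is blocked
(a sketch, assuming a single dd process and root access):

  # show the process state and the kernel function dd is waiting in
  ps -o pid,state,wchan:32,cmd -C dd

  # dump dd's kernel stack (needs root; availability depends on kernel config)
  cat /proc/$(pidof dd)/stack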

Accessing the RAID array definitely isn't locking things up this time. I
can go in and look at the partition table, for example, no problem.
Access is awfully slow, but I assume that's because of whatever dd is or
isn't doing.

By the way, I'm using kernel 6.5.3, which isn't the latest (that would
be 6.5.5) but is close.

Maybe it's an HDD issue; one of them did have some unreadable sectors in the
past, although the firmware has not decided to do anything about that, such
as reallocating them and recording it in SMART.
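
You could double-check the SMART attribute table for reallocated or pending
sectors (a sketch; attribute names vary by vendor, and /dev/sda is just an
example device):

  smartctl -A /dev/sda | grep -Ei 'realloc|pending|uncorrect'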

Check if one of the drives is holding things up, with a command like

iostat -x 2 /dev/sd?

If you see 100% next to one of the drives, and much less for the others, that
one might be the culprit.