Hi Neil,
Thanks for the very informative answer; that nailed it, and Ethan
was obviously onto it too!
I tried running the commands you posted, but I get an error:
bb.sh contains:
"
sed 's/^/-/' /sys/block/md0/md/dev-sdq/bad_blocks
while read; do
echo $a > /sys/block/md0/md/dev-sdq/bad_blocks
done
"
and running it gives:
"
root@nas3:~# ./bb.sh
-211996592 128
./bb.sh: line 3: echo: write error: Invalid argument
"
Can you help me with this?
I will clear all the bad blocks on all the drives, force a repair, and
see if any errors show up. If not, I will then fsck the filesystem.
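For reference, this is roughly what I plan to run for that step. I am
assuming every member has the same bad_blocks file under
/sys/block/md0/md/dev-*/, and that sync_action/mismatch_cnt are the
right knobs to drive and check the repair:
"
# clear the bad block list on every member of md0
for d in /sys/block/md0/md/dev-*/bad_blocks; do
    sed 's/^/-/' "$d" | while read a; do echo $a > "$d"; done
done
# kick off a full repair pass, then watch progress and mismatches
echo repair > /sys/block/md0/md/sync_action
cat /proc/mdstat
cat /sys/block/md0/md/mismatch_cnt
"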
I'm not sure how the volume failed. One Friday morning (last month) I
checked the system and everything was OK (no dmesg errors, and
/proc/mdstat reported all disks up). The next Monday I got a call
telling me that the volume was inaccessible. When I got back the
following Thursday, the machine had already been rebooted and the md0
volume had three failed disks. I did a --examine and two of them were
completely off in terms of events relative to the non-failed disks;
the other one was much closer, but still a bit off. Not close enough
for --assemble --force, so I recreated the array with something like this:
"mdadm --create --assume-clean --level=6 --raid-devices=16
--name=nas3:Datastore --uuid=9e97c588:59135324:c7d3fdf6:e543bdc3
/dev/md0 /dev/sde /dev/sdc /dev/sdd /dev/sdb /dev/sdf /dev/sdg
/dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl missing /dev/sdn missing
/dev/sdp /dev/sdq". I think the last failed drive was sdl or sdn,
can't remember.
Then I cleared the superblocks on the missing disks and re-added them.
Then I fsck'd the filesystem, and that is when I started getting those
errors. I have since replaced them with new disks and tested the old
ones, only to find that they report no SMART errors (SMART is enabled
in the BIOS); I also ran a read-write test on them and they came out
fine.
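(Roughly the kind of checks I mean, assuming smartctl from
smartmontools and a destructive badblocks pass; /dev/sdX stands in for
one of the pulled disks:)
"
# SMART health and attribute dump (smartmontools)
smartctl -H -A /dev/sdX
# destructive write+read test of every block -- only on a disk that is
# already out of the array
badblocks -wsv /dev/sdX
"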
I will rebuild this machine next weekend, or the one after that, to
try to rule out a hardware problem or issues with the cabling, but I
am inclined to say that it may be related to the SSHDs.
Cheers
Pedro
Quoting NeilBrown <neilb@xxxxxxx>:
On Wed, 02 Jul 2014 12:54:34 +0100 Pedro Teixeira <finas@xxxxxxxx> wrote:
CPU is a Phenom X6, 8 GB RAM. The controller is an LSI 9201-i16. The
HDDs are Seagate SSHD ST1000DX001.
So I ran "dd if=/dev/md0 of=/dev/null bs=4096" and it failed in a lot
of places. I had to restart the command several times with the skip
parameter set to a couple of blocks after the last block error. It ran
through about 1.5 TB of the 13 TB volume.
The md volume didn't drop any drive while running this.
dmesg showed:
[ 1678.478156] Buffer I/O error on device md0, logical block 196012546
I love numbers, thanks.
The logical block size is 4096, or 8 sectors (1 sector is defined as 512
bytes), so this is at
196012546*8 == 1568100368 sectors into the array.
The array has a chunksize of 512K, or 1024 sectors so
196012546*8/1024 = 1531348.015625
gives us the chunk number, and the remaining fraction of a chunk.
The RAID6 has 16 devices, so there are 14 data chunks in each stripe, so to
find where the above chunk is stored we divide by 14
1531348/14 = 109382.0000
So that is chunk 109382 on the first device (though with rotating data,
it might not be the very first).
Add back in the fractional part, multiply by 1024 sectors per chunk, and add
the Data Offset,
109382.01562500*1024+262144 = 112269328
So it seems that sector 112269328 on some device is bad.
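If you want to redo that for a different block number, the same
arithmetic in shell, using the geometry from above (4096-byte blocks,
512K chunks, 14 data chunks per 16-device RAID6 stripe, Data Offset
262144 sectors), would be roughly:
"
lblock=196012546                  # logical block from the dmesg line
sector=$((lblock * 8))            # 4096-byte blocks -> 512-byte sectors
chunk=$((sector / 1024))          # 512K chunk = 1024 sectors
off=$((sector % 1024))            # sector offset within that chunk
devchunk=$((chunk / 14))          # 14 data chunks per stripe
echo $((devchunk * 1024 + off + 262144))   # add Data Offset -> 112269328
"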
The command "mdadm --examine-badblocks /dev/sd[bcdefghijklmnopqr] >>
raid.b" before and after running the "dd" command returned no changes:
I didn't notice the fact that the bad block logs were not empty
before, sorry.
Anyway: ...
Bad-blocks on /dev/sdb:
112269328 for 512 sectors
Look at that - exactly the number I calculated. I love it when that works
out.
So the problem is exactly that some blocks are thought by md to be bad.
Blocks get recorded as bad (for raid6) when:
- a 'read' reported an error which could not be fixed, either
because the array was degraded so the data could not be recovered,
or because the attempt to write restored data failed
- when recovering a spare, if the data to be written cannot be found
(due to errors on other devices)
- when a 'write' request to a device fails
When your array had three failed devices, some reads and writes would have
failed. Maybe that caused the bad blocks to be recorded.
What sort of device failures were they? If the device became completely
inaccessible, then it would not have been possible to record the bad block
information.
Can you describe the sequence of events that led to the three failures?
When you put the array back together, did you --create it, or --assemble
--force?
There isn't an easy way to remove the bad block list, as doing so is
normally asking for data corruption.
However it is probably justified in your case.
As it happens I included code in the kernel to make it possible to
remove bad blocks from the list - it was intended for testing only but
I never removed it.
If you run
sed 's/^/-/' /sys/block/md0/md/dev-sdq/bad_blocks |
while read a; do
echo $a > /sys/block/md0/md/dev-sdq/bad_blocks
done
then it should clear all of the bad blocks recorded on sdq.
You should probably fail/remove the last two devices that you added to the
array before you do this, as they probably don't have properly up-to-date
information and doing this will cause corruption.
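For example (with sdX and sdY standing in for whichever two devices you
re-added last):
"
mdadm /dev/md0 --fail /dev/sdX --remove /dev/sdX
mdadm /dev/md0 --fail /dev/sdY --remove /dev/sdY
"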
I probably need to think about better ways to handle the bad block lists.
NeilBrown