RE: Troubles making a raid5 system work.

Thanks Molle,

Finally I made the RAID system work fine :-) I followed your steps, and
it worked... This is exactly what I did:
- I applied the patch
md-make-raid5-and-raid6-robust-against-failure-during-recovery.patch to my
kernel.
- dd'd all the hard disks, erasing the superblock info and everything...
- Created the array again from scratch (a sketch of the commands is
below).
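
Roughly, the wipe-and-recreate sequence looked like this (a sketch using
the /dev/sd[a-h]1 partition names from my setup, not a literal copy of
my shell history):

        # Remove any old md superblock, then overwrite each partition
        # so no stale metadata survives (the dd is slow but thorough).
        for d in /dev/sd[a-h]1; do
            mdadm --zero-superblock $d
            dd if=/dev/zero of=$d bs=1024k
        done

        # Recreate the 8-disk array from scratch (256k chunk as before).
        mdadm --create /dev/md0 --level=5 --chunk=256 \
              --raid-devices=8 /dev/sd[a-h]1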

I checked the logs and everything seems to be fine.

Thanks again.

By the way... I have two questions.
1.- Will this patch be included in new kernel versions, or do I have to
apply it each time I compile a new kernel?
2.- Working with big files (700 MB) on the RAID consumes a lot of CPU
resources; is this normal? I have a Pentium 4 at 3 GHz and 1 GB of RAM...
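
For reference, a minimal sketch of how to check where the CPU goes
(assuming the md0_raid5 kernel-thread name and the /proc knobs of a
stock 2.6 kernel):

        # Run top while copying a big file onto the array and look for
        # the md0_raid5 kernel thread: it does the parity XOR work.
        top

        # If a resync/rebuild is running at the same time, it can be
        # throttled so normal I/O gets more room (values are KB/s):
        cat /proc/mdstat
        echo 5000 > /proc/sys/dev/raid/speed_limit_max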

That's all.


> Francisco Zafra wrote:
> >  I have 8 200GB new SATA HDs, mdadm v1.9.0 and kernel 2.6.11.8.
> 
> > When the create command finishes, /proc/mdstat reports the following:
> >         md0 : active raid5 sda1[0] sdh1[8] sdg1[6] sdf1[5] sde1[4]
> >               sdd1[3] sdc1[9](F) sdb1[1]
> >         1367507456 blocks level 5, 256k chunk, algorithm 2 [8/6] [UU_UUUU_]
> 
> Odd that there are two missing disks in [UU_UUUU_], but only
> one (F) marker on the line above.
> 
> > I ran mdadm --detail and obtained this:
> > /dev/md0:
> >         Version : 00.90.01
> >   Creation Time : Tue May 24 20:02:28 2005
> >      Raid Level : raid5
> >      Array Size : 1367507456 (1304.16 GiB 1400.33 GB)
> >     Device Size : 195358208 (186.31 GiB 200.05 GB)
> >    Raid Devices : 8
> >   Total Devices : 8
> > Preferred Minor : 0
> >     Persistence : Superblock is persistent
> > 
> >     Update Time : Sun May 29 17:29:45 2005
> >           State : clean, degraded
> >  Active Devices : 6
> > Working Devices : 7
> >  Failed Devices : 1
> >   Spare Devices : 1
> 
> Oh, so that's why there's a missing F.
> 
> MD has assigned one of the disks to be a Spare device, even 
> though you did not specify any spares on the mdadm command 
> line or in the .conf file.
> 
> No clue why, but seems wrong!!
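
[Note: as far as I can tell this is deliberate: when mdadm creates a
raid5 array it starts it degraded, with the last disk as a spare, and
rebuilds parity onto it, because that is faster than a full resync.
For completeness, the spare count can also be given explicitly at
creation time; a hypothetical invocation with my device names (it does
not change the degraded-create behaviour):

        mdadm --create /dev/md0 --level=5 --chunk=256 \
              --raid-devices=8 --spare-devices=0 /dev/sd[a-h]1
]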
> 
> >        8       8      113        7      spare rebuilding   /dev/sdh1
> 
> MD's trying to rebuild with the spare.
> 
> >        9       8       33        -      faulty   /dev/sdc1
> 
> Doesn't look good.
> 
> > In the system logs I have thousands of messages like this, which were
> > not being generated during the create command:
> 
> [snip repeated sync start/done messages]
> 
> I had the same problem.
> There was once a bug in MD that caused this when a sync was in
> progress and multiple devices failed.
> See this thread for details:
> 
> http://thread.gmane.org/gmane.linux.raid/7714
> 
> (Ignore everything from Patrik Jonsson / "toy array" and 
> onwards, it's just someone that doesn't know how their mailer 
> works - shouldn't have been part of the thread)
> 
> > # mdadm -R /dev/md0
> > mdadm: failed to run array /dev/md0: Invalid argument
> 
> Hm, could be a bug, or maybe it's just a misleading error message.
> 
> I wouldn't expect anyone to be able to figure out what's 
> going on from the two words "Invalid argument", so if it can 
> be fixed, this should definitely say something a little more 
> informative.
> 
> > I have tried this several times; I have even erased and checked
> > each drive with:
> > 
> >         mdadm --zero-superblock /dev/sdd
> >         dd if=/dev/sdd of=/dev/null bs=1024k
> >         badblocks -svw /dev/sdd
> 
> Perhaps there is a more subtle hardware problem; for example,
> cable problems are common with SATA drives.
> 
> If you're using Maxtor disks, you could try testing each disk 
> with their PowerMax utility, available for download on their web site.
> 
> It might be that your problem only occurs when multiple disks 
> are accessed at the same time.  You could:
>  - Try the above dd, but run it in the background with "dd <...> &"
> for multiple disks at the same time (see the sketch after this list).
>  - Nuke the superblocks and create the array again, but this 
> time do a 'tail -f /var/log/messages | grep -v md:' before 
> you start, to check for any IDE messages you might have missed.
>  - Apply the patch that Neil Brown mentions in the thread linked
> above and see if things start to become clearer.
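
[A minimal sketch of the parallel-dd test from the first suggestion,
assuming the same /dev/sd[a-h] device names as in the original report:

        # Read every disk at once to stress controller and cables;
        # any errors should show up in /var/log/messages.
        for d in /dev/sd[a-h]; do
            dd if=$d of=/dev/null bs=1024k &
        done
        wait
]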
> 
