On Thu, 01 Apr 2010 15:07:27 +0100 Max Eaves <max@xxxxxxxxxxxxxx> wrote:

> Doug,
>
> Thank you very much for that; a great relief off my shoulders.
>
> You are right - there is a config file located in
> /etc/sysconfig/raid-check. I've changed ENABLED to no.

However there is real value in doing that check, at least occasionally.
It catches latent read errors.
You might want to run it only every couple of months, and you might want to
wind down one or both of the /proc/sys/dev/raid/speed_limit_* numbers so
there is minimal impact on your system. But not scrubbing at all is not
advisable.

NeilBrown

>
> Amazing - I've learnt something today.
>
> Thanks once again.
>
> Max
>
> On 01/04/10 14:49, Doug Ledford wrote:
> > On 04/01/2010 09:23 AM, Max Eaves wrote:
> >
> >> Hi there,
> >>
> >> I hope this gets through....my first posting on this dist.list.
> >>
> >> I am running CentOS 5.4 with a 2.6.18-164.15.1.el5 (x86_64) kernel
> >> on a rather "homebrew" backblaze system
> >> (http://blog.backblaze.com/).
> >>
> >> The mdadm version is: mdadm - v2.6.9 - 10th March 2009
> >>
> >> It uses a number of Silicon Image 3124 (sIL 3124) cards and a number of
> >> port multiplier cards (sIL3132) to read a large number of disks.
> >>
> >> I have 45 disks arranged into 3 mdadm raid sets of 15 disks. These 15
> >> disks are raided using RAID6.
> >>
> >> The problem I have is this:
> >>
> >> At random times, the RAID decides that it needs to resynchronise
> >> /dev/md10 /dev/md11 and /dev/md12. There is no error or log event in
> >> /var/log/messages, but the first thing I notice is that the performance
> >> of the RAID array drops, and checking out "cat /proc/mdstat" shows all
> >> three RAID arrays resynchronising themselves.
> >>
> >> ARRAY /dev/md0 level=raid1 num-devices=2
> >> uuid=7d7b19e6:56cc90cc:3cb166bd:b8086f29 (system boot) (not a problem)
> >> ARRAY /dev/md1 level=raid1 num-devices=2
> >> uuid=3782d93d:a491ffd4:f32c1014:94a2b3f7 (system LVM) (not a problem)
> >> ARRAY /dev/md10 level=raid6 num-devices=15
> >> uuid=5ca86e2a-3b86-4c0b-9a7a-59143bdcd0f1 (partition 1) (problem)
> >> ARRAY /dev/md11 level=raid6 num-devices=15
> >> uuid=61188c90-4825-44c5-8fac-9bc82a5799fe (partition 2) (problem)
> >> ARRAY /dev/md12 level=raid6 num-devices=15
> >> uuid=fa939816-1d0f-4eaa-98dd-c131449c3921 (partition 3) (problem)
> >>
> >> These re-synchronisation events take about a week to complete (the RAID
> >> is 18TB a pop)
> >>
> >> I know that the performance of this system is not great, but I wonder if
> >> this resynchronisation is occurring because of some I/O time-out.
> >>
> >> Oddly enough, a restart of the server fixes the problem for a couple of
> >> days, and then the problem occurs again (humm - not good).
> >>
> >> I'm happy to post logs etc....just let me know what you need.
> >>
> > Disable /etc/cron.weekly/99-raid-check. They aren't resynchronizing,
> > they are actually just checking themselves for consistency, but because
> > the 2.6.18 kernel didn't have a different word for it in the output of
> > /proc/mdstat it just looks that way. I can't remember if the version of
> > mdadm in CentOS 5.4 has the /etc/sysconfig/raid-check config file, but
> > if it does, it's easy to disable the weekly check there.
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
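
For anyone wanting to try the throttling suggested in this thread, here is a
minimal sketch, assuming a POSIX shell and root privileges. The function names
and the two *_DIR variables are made up for illustration and are not part of
mdadm or the raid-check script. The speed_limit_min and speed_limit_max files
take KiB/s per device; starting a scrub by writing "check" to an array's
md/sync_action file is an assumption about your kernel that is worth
confirming against its md documentation first.

```shell
#!/bin/sh
# Sketch only: throttle md scrubbing and start a manual consistency check.
# RAID_SYSCTL_DIR and MD_SYSFS_DIR are hypothetical, overridable variables so
# the functions can be tried against a scratch directory before the real files.
RAID_SYSCTL_DIR="${RAID_SYSCTL_DIR:-/proc/sys/dev/raid}"
MD_SYSFS_DIR="${MD_SYSFS_DIR:-/sys/block}"

# Write new values (KiB/s per device) to speed_limit_min and speed_limit_max.
set_raid_speed_limits() {
    min_kib="$1"
    max_kib="$2"
    echo "$min_kib" > "$RAID_SYSCTL_DIR/speed_limit_min" &&
    echo "$max_kib" > "$RAID_SYSCTL_DIR/speed_limit_max"
}

# Ask md to run a consistency check on one array, named like md10.
start_raid_check() {
    echo check > "$MD_SYSFS_DIR/$1/md/sync_action"
}
```

Something like "set_raid_speed_limits 1000 10000" followed by
"start_raid_check md10" would then cap the scrub before kicking it off. The
numbers are placeholders, not recommendations, so pick values that suit your
own I/O load.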