Agreed, playing with some of these settings appears to clear the problem
up, at least for the cases in which I tend to trigger it. Much obliged
for the help!

Jim

On 31 March 2010 10:12, Mark Knecht <markknecht@xxxxxxxxx> wrote:
> On Tue, Mar 30, 2010 at 6:35 PM, Roger Heflin <rogerheflin@xxxxxxxxx> wrote:
>> Jim Duchek wrote:
>>>
>>> Hi all. Regularly after a large write to the disk (untarring a very
>>> large file, etc.), my RAID5 will 'freeze' for a period of time --
>>> perhaps around a minute. My system is completely responsive otherwise
>>> during this time, with the exception of anything that is attempting to
>>> read or write from the array -- it's as if any file descriptors simply
>>> block.
> <SNIP>
>>
>> In /etc/sysctl.conf, or with "sysctl -a | grep vm.dirty", check these
>> two settings:
>> vm.dirty_background_ratio = 5
>> vm.dirty_ratio = 6
>>
>> The defaults will be something like 40 for the second one and 10 for
>> the first one.
>>
>> The 40% is how much memory the kernel lets get dirty with write data.
>> The 10% (or whatever the bottom number is) is how low the dirty total
>> has to fall, once the kernel starts cleaning up, before it lets anyone
>> write again (i.e. it freezes all writes and massively slows down reads
>> until then).
>>
>> I set the values to the above. In older kernels 5 is the minimum
>> value; newer ones may allow lower. I don't believe the limits are well
>> documented, and if you set a value below the minimum, older kernels
>> silently clamp it internally -- you won't see that on a "sysctl -a"
>> check. So on my machine a freeze lasts about as long as it takes to
>> write 1% of memory out to disk, which with 8GB is 81MB: at most a
>> second or two at 60MB/s or so. If you have 8GB and the difference
>> between the two settings is 10%, it can take 10+ seconds. I don't
>> remember the default gap, but the larger it is, the longer the freeze
>> will be.
>>
>> All of this depends on the underlying disk speed: if the underlying
>> disks are slower, writing out that amount of data takes longer and
>> things get uglier. File copies do a good job of causing this.
>
> Very interesting Roger. Thanks.
>
> I did some reading on a couple of web sites and then did some testing.
> I found that for the sort of jobs I do that create and write data --
> compiling and installing MythTV, for example -- these settings have a
> big effect on the percentage of time my system drops into these
> 100% wa, 0% CPU states. The default setting on my system was 10/20,
> and that tended to create this state quite a lot. 3/40 reduced it by
> probably 50-75%, while 3/70 seemed to eliminate it until the end of
> the build, where the kernel/compiler is presumably forcing everything
> out to disk because the job is finishing.
>
> One page I read mentioned data centers using a very good UPS and
> internal power supply and then running at 1/100. I think the basic
> idea is that if power is lost there should be enough time to flush all
> this data to disk before the power completely drops out, but up until
> that point the kernel is left to take care of things completely.
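For anyone else hitting the same freezes, here is a minimal
/etc/sysctl.conf fragment along the lines Roger suggests. His 5/6
numbers are a starting point rather than a universal recommendation, so
tune them against your own RAM size and disk speed:

# Start background writeback once 5% of RAM is dirty, and block
# writers at 6% dirty, so each forced flush stays small.
vm.dirty_background_ratio = 5
vm.dirty_ratio = 6

Apply it with "sysctl -p" and confirm with "sysctl -a | grep vm.dirty".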
>
> Experimentally, what I see is that when the dirty data crosses above
> the lower value it isn't that nothing gets written; rather, the kernel
> sort of opportunistically starts writing it to disk without letting
> that get too much in the way of running programs. Then, when the
> higher value is crossed, the system goes to 100% wait while it pushes
> the data out and waits on the disk. I used the command
>
> grep -A 1 dirty /proc/vmstat
>
> to watch a compile taking place, checking the counters both while the
> system was at 100% user/system and when it went to 100% wait.
>
> Some additional reading suggests tuning things like
>
> vm.overcommit_ratio
>
> and possibly changing the I/O scheduler
>
> keeper ~ # cat /sys/block/sda/queue/scheduler
> noop deadline [cfq]
>
> or changing the number of requests
>
> keeper ~ # cat /sys/block/sda/queue/nr_requests
> 128
>
> or the read-ahead value
>
> keeper ~ # blockdev --getra /dev/sda
> 256
>
> I haven't played with any of those.
>
> Based on this info I think it's worth my time trying a new RAID
> install to see if I'm more successful.
>
> Thanks very much for your insights and help!
>
> Cheers,
> Mark
>
>
> keeper ~ # vi /etc/sysctl.conf
>
> vm.dirty_background_ratio = 10
> vm.dirty_ratio = 20
>
> keeper ~ # sysctl -p
>
> keeper ~ # time emerge -DuN mythtv
> <SNIP>
> real    8m50.667s
> user    30m6.995s
> sys     1m30.605s
> keeper ~ #
>
>
> keeper ~ # vi /etc/sysctl.conf
>
> vm.dirty_background_ratio = 3
> vm.dirty_ratio = 40
>
> keeper ~ # sysctl -p
>
> keeper ~ # time emerge -DuN mythtv
> <SNIP>
> real    8m59.401s
> user    30m9.980s
> sys     1m30.303s
> keeper ~ #
>
>
> keeper ~ # vi /etc/sysctl.conf
>
> vm.dirty_background_ratio = 3
> vm.dirty_ratio = 70
>
> keeper ~ # time emerge -DuN mythtv
> <SNIP>
> real    8m52.272s
> user    30m0.889s
> sys     1m30.609s
> keeper ~ #
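A footnote on the block-layer knobs Mark lists above: all of them can
be changed at runtime for a quick experiment. The values below are
arbitrary examples, none of them persist across a reboot, and /dev/sda
is just the device from his transcript:

# watch the dirty/writeback counters during a heavy write
watch -n1 'grep -A 1 dirty /proc/vmstat'

# try the deadline elevator instead of cfq
echo deadline > /sys/block/sda/queue/scheduler

# allow a deeper request queue (the default above was 128)
echo 512 > /sys/block/sda/queue/nr_requests

# double the read-ahead (the value is in 512-byte sectors)
blockdev --setra 512 /dev/sda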