On Fri, Nov 13 2015, Andreas Klauer wrote:

> On Thu, Nov 12, 2015 at 05:28:41PM -0500, Joshua Kinard wrote:
>> running MD RAID5 and the XFS filesystem. I have /, /home, /usr, /var,
>> and /tmp on separate partitions, each a RAID5 setup.
>
> Hi, sorry for butting in,
>
> I have the same issue, on a regular consumer Haswell i5 box,
> with a setup very very similar to yours:
>
> 7x2TB disks, multiple partitions, for each: RAID-5, LUKS, LVM, XFS.
>
> The issue occurs during the regular RAID check, which I run daily
> (a different partition/RAID each day, so it's more like an
> evenly distributed weekly check).
>
> I have an application that uses `find -size +100M` on a directory
> tree with ~3k subdirs and ~6k files in total. It doesn't do anything
> with the find result; it's purely informational. So no big data involved,
> even though the files themselves aren't small.
>
> Yet, it's slooow. The following tests were on a completely idle box,
> apart from a running RAID check on the same /dev/mdX device.
>
> Kernel 4.2.3, unpatched:
>
> real    0m53.555s
> user    0m0.013s
> sys     0m0.037s
>
> real    1m3.777s
> user    0m0.013s
> sys     0m0.037s
>
> real    1m3.453s
> user    0m0.014s
> sys     0m0.036s
>
> Kernel 4.2.3, with ac8fa4196d20 reverted:
>
> real    0m3.206s
> user    0m0.010s
> sys     0m0.030s
>
> real    0m0.450s
> user    0m0.003s
> sys     0m0.014s
>
> real    0m0.375s
> user    0m0.003s
> sys     0m0.012s
>
> I did echo 3 > /proc/sys/vm/drop_caches between each find.
> For some reason, subsequent calls in the reverted kernel are
> considerably faster regardless. In the original kernel it
> stays slow... if I don't drop_caches, the time is 0.006s.
>
> I don't normally reboot (while a RAID sync or check is
> running), but while switching between kernels I noticed
> the shutdown was also very slow in the original kernel.
>
> Are small requests getting delayed a lot or something?

Thanks for all the details, and sorry for the delay.

Are (either of) you able to test with this small incremental patch?

When the md resync notices there is other IO pending, the old code
would make the resync wait at least 500msec, and possibly longer, to
get the overall resync speed below a threshold. A fixed threshold
doesn't make sense when devices span such a wide range of speeds.

The problem patch changes it to only wait until the pending resync
requests have finished. This means the wait is proportional to the
speed of the devices, which makes more sense. The hope was that this
would allow quite a few regular IO requests to slip into the gap
between resync requests, so that regular IO would proceed reasonably
quickly. Sometimes that worked, but obviously not for you.

This patch adds an extra delay, still proportional to the speed of
the devices, but with (hopefully) a lot more room for regular IO
requests to get queued and handled.

Thanks,
NeilBrown

diff --git a/drivers/md/md.c b/drivers/md/md.c
index c0c3e6dec248..8a25cf6087ed 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8070,8 +8070,10 @@ void md_do_sync(struct md_thread *thread)
 			 * Give other IO more of a chance.
 			 * The faster the devices, the less we wait.
 			 */
+			unsigned long start = jiffies;
 			wait_event(mddev->recovery_wait,
 				   !atomic_read(&mddev->recovery_active));
+			msleep(jiffies_to_msecs(jiffies - start));
 		}
 	}
 }
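For readers who want to see the throttling pattern outside of md.c, below is a
minimal userspace sketch of the same idea under stated assumptions: time how
long the resync path waits for its own in-flight requests, then sleep for the
same amount again, so the window left for regular IO scales with device speed.
The names throttle_resync, fake_wait and now_ms are made up for illustration
and do not exist in the kernel; this is not the actual md code, which is shown
in the diff above.

#include <stdint.h>
#include <time.h>

/* Monotonic clock in milliseconds; stands in for jiffies in the sketch. */
static uint64_t now_ms(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000 + (uint64_t)(ts.tv_nsec / 1000000);
}

/* Sleep for the given number of milliseconds. */
static void sleep_ms(uint64_t ms)
{
	struct timespec delay = {
		.tv_sec = (time_t)(ms / 1000),
		.tv_nsec = (long)((ms % 1000) * 1000000),
	};
	nanosleep(&delay, NULL);
}

static void throttle_resync(void (*wait_for_inflight)(void))
{
	uint64_t start = now_ms();

	/* Old behaviour ended here: the wait alone sized the gap. */
	wait_for_inflight();

	/*
	 * New behaviour: sleep again for as long as the wait took, so a
	 * slow device (long wait) leaves a long window for regular IO and
	 * a fast device only pauses briefly.
	 */
	sleep_ms(now_ms() - start);
}

/* Dummy stand-in for the resync wait: pretend in-flight IO takes 100ms. */
static void fake_wait(void)
{
	sleep_ms(100);
}

int main(void)
{
	throttle_resync(fake_wait);
	return 0;
}

The design point is that the extra delay is never a fixed constant: it is
derived from how long the devices themselves took, which is exactly the
"proportional to the speed of the devices" behaviour described in the mail.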