The patch titled

     md: increase the delay before marking metadata clean, and make it configurable

has been added to the -mm tree.  Its filename is

     md-increase-the-delay-before-marking-metadata-clean-and-make-it-configurable.patch

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this


From: NeilBrown <neilb@xxxxxxx>

When an md array has been idle (no writes) for 20msecs it is marked as 'clean'.
This delay turns out to be too short for some real workloads.  So increase it
to 200msec (the time to update the metadata should be a tiny fraction of that)
and make it sysfs-configurable.

Signed-off-by: Neil Brown <neilb@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 Documentation/md.txt |    9 ++++++
 drivers/md/md.c      |   54 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 61 insertions(+), 2 deletions(-)

diff -puN Documentation/md.txt~md-increase-the-delay-before-marking-metadata-clean-and-make-it-configurable Documentation/md.txt
--- devel/Documentation/md.txt~md-increase-the-delay-before-marking-metadata-clean-and-make-it-configurable	2006-04-30 22:41:19.000000000 -0700
+++ devel-akpm/Documentation/md.txt	2006-04-30 22:41:19.000000000 -0700
@@ -207,6 +207,15 @@ All md devices contain:
      available.  It will then appear at md/dev-XXX (depending on the
      name of the device) and further configuration is then possible.
 
+  safe_mode_delay
+     When an md array has seen no write requests for a certain period
+     of time, it will be marked as 'clean'.  When another write
+     request arrives, the array is marked as 'dirty' before the write
+     commences.  This is known as 'safe_mode'.
+     The 'certain period' is controlled by this file which stores the
+     period as a number of seconds.  The default is 200msec (0.200).
+     Writing a value of 0 disables safemode.
+
   sync_speed_min
   sync_speed_max
      This are similar to /proc/sys/dev/raid/speed_limit_{min,max}
diff -puN drivers/md/md.c~md-increase-the-delay-before-marking-metadata-clean-and-make-it-configurable drivers/md/md.c
--- devel/drivers/md/md.c~md-increase-the-delay-before-marking-metadata-clean-and-make-it-configurable	2006-04-30 22:41:19.000000000 -0700
+++ devel-akpm/drivers/md/md.c	2006-04-30 22:41:19.000000000 -0700
@@ -43,6 +43,7 @@
 #include <linux/suspend.h>
 #include <linux/poll.h>
 #include <linux/mutex.h>
+#include <linux/ctype.h>
 
 #include <linux/init.h>
 
@@ -1968,6 +1969,54 @@ static void analyze_sbs(mddev_t * mddev)
 }
 
 static ssize_t
+safe_delay_show(mddev_t *mddev, char *page)
+{
+	int msec = (mddev->safemode_delay*1000)/HZ;
+	return sprintf(page, "%d.%03d\n", msec/1000, msec%1000);
+}
+static ssize_t
+safe_delay_store(mddev_t *mddev, const char *cbuf, size_t len)
+{
+	int scale=1;
+	int dot=0;
+	int i;
+	unsigned long msec;
+	char buf[30];
+	char *e;
+	/* remove a period, and count digits after it */
+	if (len >= sizeof(buf))
+		return -EINVAL;
+	strlcpy(buf, cbuf, len);
+	buf[len] = 0;
+	for (i=0; i<len; i++) {
+		if (dot) {
+			if (isdigit(buf[i])) {
+				buf[i-1] = buf[i];
+				scale *= 10;
+			}
+			buf[i] = 0;
+		} else if (buf[i] == '.') {
+			dot=1;
+			buf[i] = 0;
+		}
+	}
+	msec = simple_strtoul(buf, &e, 10);
+	if (e == buf || (*e && *e != '\n'))
+		return -EINVAL;
+	msec = (msec * 1000) / scale;
+	if (msec == 0)
+		mddev->safemode_delay = 0;
+	else {
+		mddev->safemode_delay = (msec*HZ)/1000;
+		if (mddev->safemode_delay == 0)
+			mddev->safemode_delay = 1;
+	}
+	return len;
+}
+static struct md_sysfs_entry md_safe_delay =
+__ATTR(safe_mode_delay, 0644,safe_delay_show, safe_delay_store);
+
+static ssize_t
 level_show(mddev_t *mddev, char *page)
 {
 	struct mdk_personality *p = mddev->pers;
@@ -2423,6 +2472,7 @@ static struct attribute *md_default_attr
 	&md_size.attr,
 	&md_metadata.attr,
 	&md_new_device.attr,
+	&md_safe_delay.attr,
 	NULL,
 };
 
@@ -2695,7 +2745,7 @@ static int do_md_run(mddev_t * mddev)
 	mddev->safemode = 0;
 	mddev->safemode_timer.function = md_safemode_timeout;
 	mddev->safemode_timer.data = (unsigned long) mddev;
-	mddev->safemode_delay = (20 * HZ)/1000 +1; /* 20 msec delay */
+	mddev->safemode_delay = (200 * HZ)/1000 +1; /* 200 msec delay */
 	mddev->in_sync = 1;
 
 	ITERATE_RDEV(mddev,rdev,tmp)
@@ -4581,7 +4631,7 @@ void md_write_end(mddev_t *mddev)
 	if (atomic_dec_and_test(&mddev->writes_pending)) {
 		if (mddev->safemode == 2)
 			md_wakeup_thread(mddev->thread);
-		else
+		else if (mddev->safemode_delay)
 			mod_timer(&mddev->safemode_timer, jiffies + mddev->safemode_delay);
 	}
 }
_

Patches currently in -mm which might be from neilb@xxxxxxx are

md-avoid-oops-when-attempting-to-fix-read-errors-on-raid10.patch
md-fixed-refcounting-locking-when-attempting-read-error-correction-in-raid10.patch
md-change-enotsupp-to-eopnotsupp.patch
md-improve-detection-of-lack-of-barrier-support-in-raid1.patch
md-fix-rdev-nr_pending-count-when-retrying-barrier-requests.patch
fix-dcache-race-during-umount.patch
fix-dcache-race-during-umount-fix.patch
prune_one_dentry-tweaks.patch
remove-softlockup-from-invalidate_mapping_pages.patch
make-address_space_operations-invalidatepage-return-void-reiser4.patch
md-reformat-code-in-raid1_end_write_request-to-avoid-goto.patch
md-remove-arbitrary-limit-on-chunk-size.patch
md-remove-useless-ioctl-warning.patch
md-increase-the-delay-before-marking-metadata-clean-and-make-it-configurable.patch
md-merge-raid5-and-raid6-code.patch
md-remove-nuisance-message-at-shutdown.patch
md-allow-checkpoint-of-recovery-with-version-1-superblock.patch
md-allow-a-linear-array-to-have-drives-added-while-active.patch
md-support-stripe-offset-mode-in-raid10.patch
md-make-md_print_devices-static.patch
md-split-reshape-portion-of-raid5-sync_request-into-a-separate-function.patch
md-dm-reduce-stack-usage-with-stacked-block-devices.patch

-
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
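
For readers who want to check the arithmetic behind safe_mode_delay, here
is a minimal userspace sketch of the same conversion: a decimal number of
seconds is turned into milliseconds, then into timer ticks, and a non-zero
request is never rounded down to zero ticks.  It is not the kernel code
from the patch.  The function names parse_seconds_to_msec() and
msec_to_delay_jiffies() are hypothetical, the tick rate is passed in as a
parameter instead of using the kernel's HZ, and the parser walks the
digits directly rather than shifting them in a buffer.  It also
deliberately avoids strlcpy(buf, cbuf, len), since strlcpy() copies at
most len-1 bytes.  Overflow on very large inputs is not handled.

```c
#include <ctype.h>

/*
 * Parse a decimal number of seconds such as "0.200" into milliseconds.
 * Returns 0 on success and -1 on malformed input.  Digits beyond the
 * third decimal place are ignored (truncated, not rounded).
 */
static int parse_seconds_to_msec(const char *s, unsigned long *msec_out)
{
	unsigned long whole = 0, frac = 0;
	int frac_digits = 0, seen_digit = 0;

	for (; isdigit((unsigned char)*s); s++) {
		whole = whole * 10 + (unsigned long)(*s - '0');
		seen_digit = 1;
	}
	if (*s == '.') {
		s++;
		for (; isdigit((unsigned char)*s); s++) {
			if (frac_digits < 3) {
				frac = frac * 10 + (unsigned long)(*s - '0');
				frac_digits++;
			}
			seen_digit = 1;
		}
	}
	/* Pad the fraction to exactly three digits: "0.2" means 200 msec. */
	for (; frac_digits < 3; frac_digits++)
		frac *= 10;

	/* Accept one optional trailing newline, as "echo" would send. */
	if (*s == '\n')
		s++;
	if (!seen_digit || *s != '\0')
		return -1;

	*msec_out = whole * 1000 + frac;
	return 0;
}

/*
 * Convert milliseconds to ticks at the given rate.  Zero stays zero
 * (safemode disabled); any other value yields at least one tick.
 */
static unsigned long msec_to_delay_jiffies(unsigned long msec, unsigned long hz)
{
	unsigned long ticks;

	if (msec == 0)
		return 0;
	ticks = (msec * hz) / 1000;
	return ticks ? ticks : 1;
}
```

As a usage example, parsing "0.200\n" gives 200 msec, and
msec_to_delay_jiffies(200, 250) gives 50 ticks; a request of 1 msec at a
tick rate of 100 gives 1 tick instead of 0.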