Hi Shaohua, thanks for looking at this one.

From what I understand, and from what I've been told, the issue is due to increased seek time on the disk head. In this particular case, it is caused by the non-sync and resync I/O both being sequential and heavy. If the resync speed exceeds the max resync speed, the resync process is supposed to sleep for 500 milliseconds. In my testing, with the max resync speed at its default of 200K, I never saw the actual speed go above roughly 180K. I don't know whether that is because the throttle is successfully sleeping, or because the speed never reaches the max, and therefore never sleeps, which would cause the issue.
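For reference, the relevant throttle in md_do_sync() currently looks roughly like this; this is a paraphrased sketch of the code the patch below touches, trimmed for brevity, not verbatim kernel source:

    currspeed = ((unsigned long)(io_sectors - mddev->resync_mark_cnt))/2
            /((jiffies - mddev->resync_mark)/HZ + 1) + 1;

    if (currspeed > speed_min(mddev)) {
            if (currspeed > speed_max(mddev)) {
                    /* over the hard cap: back off for half a second */
                    msleep(500);
                    goto repeat;
            }
            if (!is_mddev_idle(mddev, 0)) {
                    /*
                     * Competing I/O: since ac8fa4196d20, wait only for
                     * the in-flight resync requests to drain rather
                     * than sleeping a fixed 500ms.
                     */
                    wait_event(mddev->recovery_wait,
                               !atomic_read(&mddev->recovery_active));
            }
    }

On rotational disks with heavy competing I/O, that wait_event() path lets resync keep seeking against the non-sync workload; the patch below routes the rotational case to the msleep(500) back-off instead.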
Of course, if the underlying devices are SSDs, then none of this matters, as seek times are not an issue.

If we pursue fixing this individual issue with rotational disks, more issues are likely to crop up later with different workloads. The original commit even mentions non-sync I/O performance dropping off under certain loads: ac8fa4196d20 ("md: allow resync to go faster when there is competing IO."). There is also field analysis, along with talk of rolling back the patch, here: https://www.spinics.net/lists/raid/msg51363.html (I tested the patch in the last message of that thread, but it did not help; see this thread: https://marc.info/?l=linux-raid&m=152226380103292&w=2).

So no, my patch does not address the specific question of why this particular load type exposes the non-sync I/O drop, but on a larger scale, that may be desirable for a few reasons. The original patch was written to take advantage of fast devices; rotating disks are not fast by any stretch of the imagination, especially when compared to SSDs. So, eliminating rotational disks from the code path aligns with the intent of the patch. In the commit header, the maintainer mentions 'other loads', so there were multiple loads on which he saw this performance drop, and there are certainly more in the field. At a minimum, this patch eliminates the issue as we see it, prevents a potential future need to roll back the commit, and avoids having to fix and re-fix all the different loads the issue can potentially surface with. To me, it seems like a clear step forward, maybe even a permanent one, but certainly not an "arbitrary 'fix'".

That said, if you have any ideas or suggestions, please let us know. We're seeing this issue regularly now, and we're just trying to get our customers some relief while staying consistent with upstream if at all possible.

Best Regards,
John

On Sun, Apr 1, 2018 at 7:12 PM, Shaohua Li <shli@xxxxxxxxxx> wrote:
> On Thu, Mar 29, 2018 at 03:24:38PM -0400, John Pittman wrote:
>> Commit ac8fa4196d20 ("md: allow resync to go faster when there is
>> competing IO.") introduced performance gaining features that are
>> very beneficial to solid state media. When the same features
>> are used in an environment with high I/O and rotational media,
>> extreme performance degradation of non-sync I/O can be seen.
>> Rotational devices are still slow by comparison, so to avoid these
>> performance issues, only apply the newer code path to non-rotational
>> devices.
>
> Do we know the reason why this doesn't work for rotational media? I'd rather
> understand the reason and fix the real problem instead of apply an arbitrary
> 'fix'.
>
> Thanks,
> Shaohua
>
>> Signed-off-by: John Pittman <jpittman@xxxxxxxxxx>
>> ---
>>  drivers/md/md.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 254e44e..00dc3c4 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -8496,11 +8496,13 @@ void md_do_sync(struct md_thread *thread)
>>                       /((jiffies-mddev->resync_mark)/HZ +1) +1;
>>
>>               if (currspeed > speed_min(mddev)) {
>> -                     if (currspeed > speed_max(mddev)) {
>> +                     if (currspeed > speed_max(mddev) || (!is_mddev_idle(mddev, 0) &&
>> +                         !blk_queue_nonrot(mddev->queue))) {
>>                               msleep(500);
>>                               goto repeat;
>>                       }
>> -                     if (!is_mddev_idle(mddev, 0)) {
>> +                     if (!is_mddev_idle(mddev, 0) &&
>> +                         blk_queue_nonrot(mddev->queue)) {
>>                               /*
>>                                * Give other IO more of a chance.
>>                                * The faster the devices, the less we wait.
>> --
>> 2.7.5