Hi Shaohua, thanks for looking at this one.

From what I understand, and from what I've been told, the issue is due to increased seek time on the disk head. In this particular case, it is caused by the non-sync and resync I/O both being sequential and heavy. If the resync speed exceeds the max resync speed, the resync process is supposed to sleep for 500 milliseconds. In my testing, with the max resync speed at its default of 200K, I never saw the actual speed go above roughly 180K. I don't know whether that is because the throttle is successfully sleeping, or because the speed never reaches the max, and therefore never sleeps, which would cause the issue.
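For reference, the relevant throttle in md_do_sync() currently looks roughly like this; this is a paraphrased sketch of the code the patch below touches, trimmed for brevity, not verbatim kernel source:

    currspeed = ((unsigned long)(io_sectors - mddev->resync_mark_cnt))/2
            /((jiffies - mddev->resync_mark)/HZ + 1) + 1;

    if (currspeed > speed_min(mddev)) {
            if (currspeed > speed_max(mddev)) {
                    /* over the hard cap: back off for half a second */
                    msleep(500);
                    goto repeat;
            }
            if (!is_mddev_idle(mddev, 0)) {
                    /*
                     * Competing I/O: since ac8fa4196d20, wait only for
                     * the in-flight resync requests to drain rather
                     * than sleeping a fixed 500ms.
                     */
                    wait_event(mddev->recovery_wait,
                               !atomic_read(&mddev->recovery_active));
            }
    }

On rotational disks with heavy competing I/O, that wait_event() path lets resync keep seeking against the non-sync workload; the patch below routes the rotational case to the msleep(500) back-off instead.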
Of course, if the underlying devices are SSDs, then none of this matters, as seek times are not an issue.

If we pursue fixing this individual issue with rotational disks, more issues are likely to crop up later with different workloads. The original commit even mentions non-sync I/O performance dropping off under certain loads: ac8fa4196d20 ("md: allow resync to go faster when there is competing IO."). There is also field analysis, along with talk of rolling back the patch, here: https://www.spinics.net/lists/raid/msg51363.html (I tested the patch in the last message of that thread, but it did not help; see this thread: https://marc.info/?l=linux-raid&m=152226380103292&w=2).

So no, my patch does not address the specific question of why this particular load type exposes the non-sync I/O drop, but on a larger scale, that may be desirable for a few reasons. The original patch was written to take advantage of fast devices; rotating disks are not fast by any stretch of the imagination, especially when compared to SSDs. So, eliminating rotational disks from the code path aligns with the intent of the patch. In the commit header, the maintainer mentions 'other loads', so there were multiple loads on which he saw this performance drop, and there are certainly more in the field. At a minimum, this patch eliminates the issue as we see it, prevents a potential future need to roll back the commit, and avoids having to fix and re-fix all the different loads the issue can potentially surface with. To me, it seems like a clear step forward, maybe even a permanent one, but certainly not an "arbitrary 'fix'".

That said, if you have any ideas or suggestions, please let us know. We're seeing this issue regularly now, and we're just trying to get our customers some relief while staying consistent with upstream if at all possible.

Best Regards,
John

On Sun, Apr 1, 2018 at 7:12 PM, Shaohua Li <shli@xxxxxxxxxx> wrote:
> On Thu, Mar 29, 2018 at 03:24:38PM -0400, John Pittman wrote:
>> Commit ac8fa4196d20 ("md: allow resync to go faster when there is
>> competing IO.") introduced performance gaining features that are
>> very beneficial to solid state media. When the same features
>> are used in an environment with high I/O and rotational media,
>> extreme performance degradation of non-sync I/O can be seen.
>> Rotational devices are still slow by comparison, so to avoid these
>> performance issues, only apply the newer code path to non-rotational
>> devices.
>
> Do we know the reason why this doesn't work for rotational media? I'd rather
> understand the reason and fix the real problem instead of apply an arbitrary
> 'fix'.
>
> Thanks,
> Shaohua
>
>> Signed-off-by: John Pittman <jpittman@xxxxxxxxxx>
>> ---
>>  drivers/md/md.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 254e44e..00dc3c4 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -8496,11 +8496,13 @@ void md_do_sync(struct md_thread *thread)
>>                       /((jiffies-mddev->resync_mark)/HZ +1) +1;
>>
>>               if (currspeed > speed_min(mddev)) {
>> -                     if (currspeed > speed_max(mddev)) {
>> +                     if (currspeed > speed_max(mddev) || (!is_mddev_idle(mddev, 0) &&
>> +                         !blk_queue_nonrot(mddev->queue))) {
>>                               msleep(500);
>>                               goto repeat;
>>                       }
>> -                     if (!is_mddev_idle(mddev, 0)) {
>> +                     if (!is_mddev_idle(mddev, 0) &&
>> +                         blk_queue_nonrot(mddev->queue)) {
>>                               /*
>>                                * Give other IO more of a chance.
>>                                * The faster the devices, the less we wait.
>> --
>> 2.7.5